Overview

Dataset statistics

Number of variables: 19
Number of observations: 45463
Missing cells: 73825
Missing cells (%): 8.5%
Duplicate rows: 16
Duplicate rows (%): < 0.1%
Total size in memory: 6.6 MiB
Average record size in memory: 152.0 B
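The percentages and the per-record size follow from the raw counts. The sketch below rechecks them, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1,048,576 bytes. The function names are illustrative and not part of the profiling tool.

```python
def missing_cell_percent(n_missing: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * n_missing / (n_vars * n_obs)


def average_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average in-memory size of one row, in bytes."""
    return total_mib * 1024 * 1024 / n_obs


# Figures taken from the dataset statistics of this report.
print(round(missing_cell_percent(73825, 19, 45463), 1))
print(round(average_record_bytes(6.6, 45463), 1))
```

The second figure lands close to, but not exactly on, the reported 152.0 B, because the total size of 6.6 MiB is itself rounded.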

Variable types

Categorical: 11
Numeric: 8

Alerts

Dataset has 16 (< 0.1%) duplicate rows [Duplicates]
belongs_to_collection has a high cardinality: 1695 distinct values [High cardinality]
genres has a high cardinality: 4065 distinct values [High cardinality]
original_language has a high cardinality: 89 distinct values [High cardinality]
overview has a high cardinality: 44306 distinct values [High cardinality]
production_companies has a high cardinality: 22671 distinct values [High cardinality]
production_countries has a high cardinality: 2390 distinct values [High cardinality]
release_date has a high cardinality: 17333 distinct values [High cardinality]
spoken_languages has a high cardinality: 1841 distinct values [High cardinality]
tagline has a high cardinality: 20283 distinct values [High cardinality]
title has a high cardinality: 42277 distinct values [High cardinality]
budget is highly overall correlated with revenue and 1 other field [High correlation]
revenue is highly overall correlated with budget and 1 other field [High correlation]
return is highly overall correlated with budget and 1 other field [High correlation]
original_language is highly imbalanced (67.4%) [Imbalance]
production_countries is highly imbalanced (58.4%) [Imbalance]
spoken_languages is highly imbalanced (62.0%) [Imbalance]
status is highly imbalanced (97.0%) [Imbalance]
belongs_to_collection has 40972 (90.1%) missing values [Missing]
genres has 2442 (5.4%) missing values [Missing]
overview has 954 (2.1%) missing values [Missing]
spoken_languages has 3955 (8.7%) missing values [Missing]
tagline has 25051 (55.1%) missing values [Missing]
popularity is highly skewed (γ₁ = 29.22545384) [Skewed]
return is highly skewed (γ₁ = 138.4620841) [Skewed]
release_year is highly skewed (γ₁ = -20.43091917) [Skewed]
overview is uniformly distributed [Uniform]
tagline is uniformly distributed [Uniform]
title is uniformly distributed [Uniform]
budget has 36573 (80.4%) zeros [Zeros]
revenue has 38055 (83.7%) zeros [Zeros]
runtime has 1558 (3.4%) zeros [Zeros]
vote_average has 2998 (6.6%) zeros [Zeros]
return has 40082 (88.2%) zeros [Zeros]
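Several of these alerts are simple ratios over a column. The sketch below shows how the missing, zeros, and duplicate-row figures could be reproduced in plain Python on a list of row dictionaries. The helper names and the toy rows are hypothetical. The thresholds the profiler uses to raise each alert are not shown in this report, and whether it counts distinct duplicated rows or every repeated occurrence is not stated either.

```python
from collections import Counter


def missing_share(rows, column):
    """Fraction of rows whose value in `column` is None."""
    return sum(row[column] is None for row in rows) / len(rows)


def zero_share(rows, column):
    """Fraction of rows whose value in `column` equals zero."""
    return sum(row[column] == 0 for row in rows) / len(rows)


def duplicate_row_count(rows):
    """Number of distinct rows that occur more than once."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(1 for n in counts.values() if n > 1)


# Toy data, not taken from the profiled dataset.
rows = [
    {"title": "A", "budget": 0, "tagline": None},
    {"title": "B", "budget": 5000000, "tagline": "x"},
    {"title": "A", "budget": 0, "tagline": None},
    {"title": "C", "budget": 0, "tagline": None},
]
print(missing_share(rows, "tagline"))  # 0.75
print(zero_share(rows, "budget"))      # 0.75
print(duplicate_row_count(rows))       # 1
```

Applied to the reported counts, the same ratio gives 36573 / 45463 ≈ 80.4% zeros for budget.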

Reproduction

Analysis started: 2023-05-15 18:12:03.688469
Analysis finished: 2023-05-15 18:13:34.749072
Duration: 1 minute and 31.06 seconds
Software version: ydata-profiling v4.1.2
Download configuration: config.json
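The reported duration is the difference between the two timestamps. A short standard-library check, with the timestamp format string being the only assumption:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-05-15 18:12:03.688469", FMT)
finished = datetime.strptime("2023-05-15 18:13:34.749072", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```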

Variables

belongs_to_collection
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 1695
Distinct (%): 37.7%
Missing: 40972
Missing (%): 90.1%
Memory size: 355.3 KiB
The Bowery Boys
 
29
Totò Collection
 
27
James Bond Collection
 
26
Zatôichi: The Blind Swordsman
 
26
The Carry On Collection
 
25
Other values (1690)
4358 

Length

Max length: 54
Median length: 43
Mean length: 23.856379
Min length: 3

Characters and Unicode

Total characters: 107139
Distinct characters: 166
Distinct categories: 12
Distinct scripts: 7
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
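As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories in a single string. The two-letter codes map to the names used in this report (`Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator). How the profiler derives scripts and blocks is not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Ll', 'Lu', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# "Totò Collection", with the accented letter written as an escape.
counts = category_counts("Tot\u00f2 Collection")
print(counts["Ll"], counts["Lu"], counts["Zs"])  # 12 2 1
```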

Unique

Unique: 390
Unique (%): 8.7%
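"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A minimal sketch with a hypothetical helper follows. The final lines check that the reported percentages are consistent with a denominator of non-missing rows, which is an inference from the numbers rather than something the report states.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing (None) entries.

    distinct: number of different values present
    unique:   number of values that occur exactly once
    """
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


sample = ["James Bond Collection", "Balto Collection", None,
          "James Bond Collection", "Toy Story Collection", None]
print(distinct_and_unique(sample))  # (3, 2)

# Reported figures for belongs_to_collection, relative to non-missing rows.
non_missing = 45463 - 40972
print(round(100 * 1695 / non_missing, 1), round(100 * 390 / non_missing, 1))  # 37.7 8.7
```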

Sample

1st row: Toy Story Collection
2nd row: Grumpy Old Men Collection
3rd row: Father of the Bride Collection
4th row: James Bond Collection
5th row: Balto Collection

Common Values

Value | Count | Frequency (%)
The Bowery Boys | 29 | 0.1%
Totò Collection | 27 | 0.1%
James Bond Collection | 26 | 0.1%
Zatôichi: The Blind Swordsman | 26 | 0.1%
The Carry On Collection | 25 | 0.1%
Pokémon Collection | 22 | < 0.1%
Charlie Chan (Sidney Toler) Collection | 21 | < 0.1%
Godzilla (Showa) Collection | 16 | < 0.1%
Uuno Turhapuro | 15 | < 0.1%
Dragon Ball Z (Movie) Collection | 15 | < 0.1%
Other values (1685) | 4269 | 9.4%
(Missing) | 40972 | 90.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
collection 3746
25.3%
the 1146
 
7.7%
of 230
 
1.6%
series 147
 
1.0%
139
 
0.9%
trilogy 87
 
0.6%
and 84
 
0.6%
a 62
 
0.4%
man 62
 
0.4%
in 56
 
0.4%
Other values (2407) 9033
61.1%
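The table of lowercase tokens above (collection, the, of, ...) can be approximated by splitting each value on whitespace and counting the lowercased pieces. This is a sketch under that assumption. The tokenizer the profiler actually uses, including how it treats punctuation, is not shown in this report, and `word_frequencies` is a hypothetical name.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercase whitespace-separated tokens across all non-missing strings."""
    counts = Counter()
    for value in values:
        if value is not None:
            counts.update(value.lower().split())
    return counts


names = ["Toy Story Collection", "The Carry On Collection", None, "The Bowery Boys"]
freq = word_frequencies(names)
print(freq["collection"], freq["the"], freq["toy"])  # 2 2 1
```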

Most occurring characters

ValueCountFrequency (%)
o 11121
 
10.4%
e 10460
 
9.8%
10302
 
9.6%
l 10207
 
9.5%
i 7563
 
7.1%
n 7410
 
6.9%
t 6492
 
6.1%
c 4851
 
4.5%
C 4477
 
4.2%
a 4462
 
4.2%
Other values (156) 29794
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 81164
75.8%
Uppercase Letter 13893
 
13.0%
Space Separator 10302
 
9.6%
Other Punctuation 576
 
0.5%
Close Punctuation 335
 
0.3%
Open Punctuation 335
 
0.3%
Decimal Number 321
 
0.3%
Dash Punctuation 162
 
0.2%
Other Letter 37
 
< 0.1%
Final Punctuation 9
 
< 0.1%
Other values (2) 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 11121
13.7%
e 10460
12.9%
l 10207
12.6%
i 7563
9.3%
n 7410
9.1%
t 6492
8.0%
c 4851
 
6.0%
a 4462
 
5.5%
r 3873
 
4.8%
s 2588
 
3.2%
Other values (69) 12137
15.0%
Uppercase Letter
ValueCountFrequency (%)
C 4477
32.2%
T 1527
 
11.0%
S 1064
 
7.7%
B 682
 
4.9%
M 631
 
4.5%
A 509
 
3.7%
D 507
 
3.6%
H 462
 
3.3%
P 432
 
3.1%
G 417
 
3.0%
Other values (33) 3185
22.9%
Other Letter
ValueCountFrequency (%)
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
2
 
5.4%
Other values (4) 8
21.6%
Other Punctuation
ValueCountFrequency (%)
. 172
29.9%
' 107
18.6%
: 99
17.2%
, 79
13.7%
& 52
 
9.0%
! 35
 
6.1%
/ 21
 
3.6%
? 4
 
0.7%
* 4
 
0.7%
3
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 80
24.9%
9 64
19.9%
3 54
16.8%
0 51
15.9%
2 21
 
6.5%
8 13
 
4.0%
5 12
 
3.7%
7 11
 
3.4%
6 10
 
3.1%
4 5
 
1.6%
Close Punctuation
ValueCountFrequency (%)
) 330
98.5%
] 5
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 330
98.5%
[ 5
 
1.5%
Dash Punctuation
ValueCountFrequency (%)
- 160
98.8%
2
 
1.2%
Space Separator
ValueCountFrequency (%)
10302
100.0%
Final Punctuation
ValueCountFrequency (%)
9
100.0%
Modifier Letter
ValueCountFrequency (%)
3
100.0%
Other Number
ValueCountFrequency (%)
½ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 94643
88.3%
Common 12045
 
11.2%
Cyrillic 414
 
0.4%
Hiragana 15
 
< 0.1%
Hangul 10
 
< 0.1%
Katakana 9
 
< 0.1%
Han 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 11121
11.8%
e 10460
11.1%
l 10207
10.8%
i 7563
 
8.0%
n 7410
 
7.8%
t 6492
 
6.9%
c 4851
 
5.1%
C 4477
 
4.7%
a 4462
 
4.7%
r 3873
 
4.1%
Other values (70) 23727
25.1%
Cyrillic
ValueCountFrequency (%)
л 48
 
11.6%
и 41
 
9.9%
о 37
 
8.9%
к 30
 
7.2%
е 27
 
6.5%
я 25
 
6.0%
а 17
 
4.1%
ц 16
 
3.9%
К 16
 
3.9%
р 14
 
3.4%
Other values (32) 143
34.5%
Common
ValueCountFrequency (%)
10302
85.5%
) 330
 
2.7%
( 330
 
2.7%
. 172
 
1.4%
- 160
 
1.3%
' 107
 
0.9%
: 99
 
0.8%
1 80
 
0.7%
, 79
 
0.7%
9 64
 
0.5%
Other values (20) 322
 
2.7%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%
Katakana
ValueCountFrequency (%)
3
33.3%
3
33.3%
3
33.3%
Han
ValueCountFrequency (%)
3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106425
99.3%
Cyrillic 414
 
0.4%
None 246
 
0.2%
Hiragana 15
 
< 0.1%
Punctuation 14
 
< 0.1%
Katakana 12
 
< 0.1%
Hangul 10
 
< 0.1%
CJK 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 11121
 
10.4%
e 10460
 
9.8%
10302
 
9.7%
l 10207
 
9.6%
i 7563
 
7.1%
n 7410
 
7.0%
t 6492
 
6.1%
c 4851
 
4.6%
C 4477
 
4.2%
a 4462
 
4.2%
Other values (67) 29080
27.3%
Cyrillic
ValueCountFrequency (%)
л 48
 
11.6%
и 41
 
9.9%
о 37
 
8.9%
к 30
 
7.2%
е 27
 
6.5%
я 25
 
6.0%
а 17
 
4.1%
ц 16
 
3.9%
К 16
 
3.9%
р 14
 
3.4%
Other values (32) 143
34.5%
None
ValueCountFrequency (%)
é 45
18.3%
ä 40
16.3%
ô 35
14.2%
ò 28
11.4%
ö 19
7.7%
ı 14
 
5.7%
ó 14
 
5.7%
í 9
 
3.7%
İ 4
 
1.6%
á 4
 
1.6%
Other values (19) 34
13.8%
Punctuation
ValueCountFrequency (%)
9
64.3%
3
 
21.4%
2
 
14.3%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%
CJK
ValueCountFrequency (%)
3
100.0%
Katakana
ValueCountFrequency (%)
3
25.0%
3
25.0%
3
25.0%
3
25.0%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%

budget
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1223
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4224578.8
Minimum: 0
Maximum: 3.8 × 10⁸
Zeros: 36573
Zeros (%): 80.4%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 25000000
Maximum: 3.8 × 10⁸
Range: 3.8 × 10⁸
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 17424133
Coefficient of variation (CV): 4.1244662
Kurtosis: 66.765616
Mean: 4224578.8
Median Absolute Deviation (MAD): 0
Skewness: 7.125326
Sum: 1.9206203 × 10¹¹
Variance: 3.036004 × 10¹⁴
Monotonicity: Not monotonic
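The spread measures above are related: the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the MAD is the median of absolute deviations from the median. A minimal sketch, assuming the sample standard deviation (n − 1 denominator) that pandas uses by default. Whether the report uses exactly that estimator is an assumption, and `spread_summary` is a hypothetical helper.

```python
import statistics


def spread_summary(values):
    """Return standard deviation, variance, CV and MAD for a list of numbers."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    return {"std": std, "variance": std ** 2, "cv": std / mean, "mad": mad}


# Toy data: when more than half of the values equal the median, the MAD is 0.
budgets = [0, 0, 0, 0, 25_000_000]
print(spread_summary(budgets)["mad"])  # 0

# Consistency check on the reported figures: CV = std / mean.
print(round(17424133 / 4224578.8, 4))  # 4.1245
```

This explains why budget, with 80.4% zeros, reports both a median and a MAD of 0.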
Histogram with fixed size bins (bins=50)
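A fixed-size-bin histogram divides the range from minimum to maximum into equally wide intervals and counts the values in each. The pure-Python sketch below illustrates the idea. It is not the plotting code behind the report, and `fixed_width_histogram` is a hypothetical name.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate range: everything falls in the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum falls on the right edge; place it in the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


counts = fixed_width_histogram([0, 0, 0, 1, 5, 10], bins=5)
print(counts)  # [4, 0, 1, 0, 1]
```

With a column dominated by zeros, almost all of the mass ends up in the first bin.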
ValueCountFrequency (%)
0 36573
80.4%
5000000 286
 
0.6%
10000000 259
 
0.6%
20000000 243
 
0.5%
2000000 242
 
0.5%
15000000 226
 
0.5%
3000000 223
 
0.5%
25000000 206
 
0.5%
1000000 197
 
0.4%
30000000 190
 
0.4%
Other values (1213) 6818
 
15.0%
ValueCountFrequency (%)
0 36573
80.4%
1 25
 
0.1%
2 14
 
< 0.1%
3 9
 
< 0.1%
4 8
 
< 0.1%
5 8
 
< 0.1%
6 5
 
< 0.1%
7 4
 
< 0.1%
8 5
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
380000000 1
 
< 0.1%
300000000 1
 
< 0.1%
280000000 1
 
< 0.1%
270000000 1
 
< 0.1%
260000000 3
 
< 0.1%
258000000 1
 
< 0.1%
255000000 1
 
< 0.1%
250000000 10
< 0.1%
245000000 2
 
< 0.1%
237000000 1
 
< 0.1%

genres
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 4065
Distinct (%): 9.4%
Missing: 2442
Missing (%): 5.4%
Memory size: 355.3 KiB
Drama
5000 
Comedy
3621 
Documentary
 
2723
Drama, Romance
 
1301
Comedy, Drama
 
1135
Other values (4060)
29241 

Length

Max length: 80
Median length: 65
Mean length: 16.46461
Min length: 3

Characters and Unicode

Total characters: 708324
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2362
Unique (%): 5.5%

Sample

1st row: Animation, Comedy, Family
2nd row: Adventure, Fantasy, Family
3rd row: Romance, Comedy
4th row: Comedy, Drama, Romance
5th row: Comedy

Common Values

Value | Count | Frequency (%)
Drama | 5000 | 11.0%
Comedy | 3621 | 8.0%
Documentary | 2723 | 6.0%
Drama, Romance | 1301 | 2.9%
Comedy, Drama | 1135 | 2.5%
Horror | 974 | 2.1%
Comedy, Romance | 930 | 2.0%
Comedy, Drama, Romance | 593 | 1.3%
Drama, Comedy | 532 | 1.2%
Horror, Thriller | 528 | 1.2%
Other values (4055) | 25684 | 56.5%
(Missing) | 2442 | 5.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
drama 20265
21.4%
comedy 13182
13.9%
thriller 7624
 
8.0%
romance 6735
 
7.1%
action 6596
 
6.9%
horror 4673
 
4.9%
crime 4307
 
4.5%
documentary 3932
 
4.1%
adventure 3496
 
3.7%
science 3049
 
3.2%
Other values (12) 21051
22.2%

Most occurring characters

ValueCountFrequency (%)
r 69119
 
9.8%
a 61851
 
8.7%
e 55810
 
7.9%
m 53126
 
7.5%
51889
 
7.3%
o 48562
 
6.9%
, 48073
 
6.8%
i 39699
 
5.6%
n 35704
 
5.0%
y 28529
 
4.0%
Other values (20) 215962
30.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 512685
72.4%
Uppercase Letter 95677
 
13.5%
Space Separator 51889
 
7.3%
Other Punctuation 48073
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 69119
13.5%
a 61851
12.1%
e 55810
10.9%
m 53126
10.4%
o 48562
9.5%
i 39699
7.7%
n 35704
7.0%
y 28529
5.6%
c 28008
5.5%
t 26228
 
5.1%
Other values (7) 66049
12.9%
Uppercase Letter
ValueCountFrequency (%)
D 24197
25.3%
C 17489
18.3%
A 12027
12.6%
F 9754
10.2%
T 8391
 
8.8%
R 6735
 
7.0%
H 6071
 
6.3%
M 4832
 
5.1%
S 3049
 
3.2%
W 2365
 
2.5%
Space Separator
ValueCountFrequency (%)
51889
100.0%
Other Punctuation
ValueCountFrequency (%)
, 48073
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 608362
85.9%
Common 99962
 
14.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 69119
11.4%
a 61851
 
10.2%
e 55810
 
9.2%
m 53126
 
8.7%
o 48562
 
8.0%
i 39699
 
6.5%
n 35704
 
5.9%
y 28529
 
4.7%
c 28008
 
4.6%
t 26228
 
4.3%
Other values (18) 161726
26.6%
Common
ValueCountFrequency (%)
51889
51.9%
, 48073
48.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 708324
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 69119
 
9.8%
a 61851
 
8.7%
e 55810
 
7.9%
m 53126
 
7.5%
51889
 
7.3%
o 48562
 
6.9%
, 48073
 
6.8%
i 39699
 
5.6%
n 35704
 
5.0%
y 28529
 
4.0%
Other values (20) 215962
30.5%

id
Real number (ℝ)

Distinct: 45433
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108359.92
Minimum: 2
Maximum: 469172
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5421.1
Q1: 26449.5
median: 60003
Q3: 157328
95-th percentile: 358641.1
Maximum: 469172
Range: 469170
Interquartile range (IQR): 130878.5
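The range and IQR follow directly from the other quantiles (maximum minus minimum, and Q3 minus Q1). The sketch below computes quantiles by linear interpolation between the closest ranks, which is the pandas default. That the report uses this method is an assumption, although a half-valued Q1 such as 26449.5 is consistent with it. The `quantile` helper is hypothetical.

```python
def quantile(sorted_values, q):
    """Quantile with linear interpolation between closest ranks (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Toy data, not taken from the profiled dataset.
data = sorted([2, 3, 5, 6, 11, 12])
q1, q3 = quantile(data, 0.25), quantile(data, 0.75)
print(q1, q3, q3 - q1)  # 3.5 9.75 6.25

# Consistency check on the reported figures for id.
print(157328 - 26449.5, 469172 - 2)  # 130878.5 469170
```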

Descriptive statistics

Standard deviation: 112460.75
Coefficient of variation (CV): 1.0378445
Kurtosis: 0.54820057
Mean: 108359.92
Median Absolute Deviation (MAD): 44525
Skewness: 1.2797219
Sum: 4.926367 × 10⁹
Variance: 1.264742 × 10¹⁰
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
141971 3
 
< 0.1%
5511 2
 
< 0.1%
132641 2
 
< 0.1%
10991 2
 
< 0.1%
168538 2
 
< 0.1%
4912 2
 
< 0.1%
18440 2
 
< 0.1%
15028 2
 
< 0.1%
14788 2
 
< 0.1%
265189 2
 
< 0.1%
Other values (45423) 45442
> 99.9%
ValueCountFrequency (%)
2 1
< 0.1%
3 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
11 1
< 0.1%
12 1
< 0.1%
13 1
< 0.1%
14 1
< 0.1%
15 1
< 0.1%
16 1
< 0.1%
ValueCountFrequency (%)
469172 1
< 0.1%
468707 1
< 0.1%
468343 1
< 0.1%
467731 1
< 0.1%
465044 1
< 0.1%
464819 1
< 0.1%
464207 1
< 0.1%
464111 1
< 0.1%
463906 1
< 0.1%
463800 1
< 0.1%

original_language
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 89
Distinct (%): 0.2%
Missing: 11
Missing (%): < 0.1%
Memory size: 355.3 KiB
en
32269 
fr
 
2438
it
 
1529
ja
 
1350
de
 
1080
Other values (84)
6786 

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 90904
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): < 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value | Count | Frequency (%)
en | 32269 | 71.0%
fr | 2438 | 5.4%
it | 1529 | 3.4%
ja | 1350 | 3.0%
de | 1080 | 2.4%
es | 994 | 2.2%
ru | 826 | 1.8%
hi | 508 | 1.1%
ko | 444 | 1.0%
zh | 409 | 0.9%
Other values (79) | 3605 | 7.9%
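The imbalance alert for this variable summarises how unevenly the counts above are spread over the categories. One common way to score this is one minus the normalized Shannon entropy of the class counts. The sketch below implements that score. Whether the report computes its 67.4% figure with exactly this formula is an assumption, and `imbalance_score` is a hypothetical name.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 for a uniform split, near 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


print(imbalance_score([10, 10, 10, 10]))        # uniform: score of 0
print(round(imbalance_score([97, 1, 1, 1]), 2))  # one dominant class: close to 1
```

Normalizing by the logarithm of the number of classes keeps the score between 0 and 1 regardless of how many categories the column has.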

Length

Histogram of lengths of the category
ValueCountFrequency (%)
en 32269
71.0%
fr 2438
 
5.4%
it 1529
 
3.4%
ja 1350
 
3.0%
de 1080
 
2.4%
es 994
 
2.2%
ru 826
 
1.8%
hi 508
 
1.1%
ko 444
 
1.0%
zh 409
 
0.9%
Other values (79) 3605
 
7.9%

Most occurring characters

ValueCountFrequency (%)
e 34598
38.1%
n 32978
36.3%
r 3636
 
4.0%
f 2839
 
3.1%
i 2391
 
2.6%
t 2252
 
2.5%
a 1841
 
2.0%
s 1654
 
1.8%
j 1351
 
1.5%
d 1325
 
1.5%
Other values (16) 6039
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 90904
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34598
38.1%
n 32978
36.3%
r 3636
 
4.0%
f 2839
 
3.1%
i 2391
 
2.6%
t 2252
 
2.5%
a 1841
 
2.0%
s 1654
 
1.8%
j 1351
 
1.5%
d 1325
 
1.5%
Other values (16) 6039
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 90904
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34598
38.1%
n 32978
36.3%
r 3636
 
4.0%
f 2839
 
3.1%
i 2391
 
2.6%
t 2252
 
2.5%
a 1841
 
2.0%
s 1654
 
1.8%
j 1351
 
1.5%
d 1325
 
1.5%
Other values (16) 6039
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90904
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 34598
38.1%
n 32978
36.3%
r 3636
 
4.0%
f 2839
 
3.1%
i 2391
 
2.6%
t 2252
 
2.5%
a 1841
 
2.0%
s 1654
 
1.8%
j 1351
 
1.5%
d 1325
 
1.5%
Other values (16) 6039
 
6.6%

overview
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 44306
Distinct (%): 99.5%
Missing: 954
Missing (%): 2.1%
Memory size: 355.3 KiB
No overview found.
 
133
No Overview
 
7
 
5
Recovering from a nail gun shot to the head and 13 months of coma, doctor Pekka Valinta starts to unravel the mystery of his past, still suffering from total amnesia.
 
3
A few funny little novels about different aspects of life.
 
3
Other values (44301)
44358 

Length

Max length: 1000
Median length: 785
Mean length: 323.34281
Min length: 1

Characters and Unicode

Total characters: 14391665
Distinct characters: 429
Distinct categories: 25
Distinct scripts: 13
Distinct blocks: 21
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44247
Unique (%): 99.4%

Sample

1st row: Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences.
2nd row: When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures.
3rd row: A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max.
4th row: Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe.
5th row: Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own.

Common Values

ValueCountFrequency (%)
No overview found. 133
 
0.3%
No Overview 7
 
< 0.1%
5
 
< 0.1%
Recovering from a nail gun shot to the head and 13 months of coma, doctor Pekka Valinta starts to unravel the mystery of his past, still suffering from total amnesia. 3
 
< 0.1%
A few funny little novels about different aspects of life. 3
 
< 0.1%
No movie overview available. 3
 
< 0.1%
Adaptation of the Jane Austen novel. 3
 
< 0.1%
King Lear, old and tired, divides his kingdom among his daughters, giving great importance to their protestations of love for him. When Cordelia, youngest and most honest, refuses to idly flatter the old man in return for favor, he banishes her and turns for support to his remaining daughters. But Goneril and Regan have no love for him and instead plot to take all his power from him. In a parallel, Lear's loyal courtier Gloucester favors his illegitimate son Edmund after being told lies about his faithful son Edgar. Madness and tragedy befall both ill-starred fathers. 3
 
< 0.1%
Adventurer Allan Quartermain leads an expedition into uncharted African territory in an attempt to locate an explorer who went missing during his search for the fabled diamond mines of King Solomon. 2
 
< 0.1%
While holidaying in the French Alps, a Swedish family deals with acts of cowardliness as an avalanche breaks out. 2
 
< 0.1%
Other values (44296) 44345
97.5%
(Missing) 954
 
2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 138357
 
5.6%
a 99037
 
4.0%
and 75407
 
3.1%
to 73442
 
3.0%
of 69723
 
2.8%
in 48228
 
2.0%
is 36550
 
1.5%
his 36210
 
1.5%
with 23933
 
1.0%
her 21518
 
0.9%
Other values (97181) 1830620
74.6%

Most occurring characters

ValueCountFrequency (%)
2410599
16.7%
e 1366174
 
9.5%
a 942275
 
6.5%
t 936476
 
6.5%
i 853105
 
5.9%
o 831419
 
5.8%
n 824147
 
5.7%
s 769185
 
5.3%
r 745638
 
5.2%
h 601821
 
4.2%
Other values (419) 4110826
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11170206
77.6%
Space Separator 2410637
 
16.8%
Uppercase Letter 391748
 
2.7%
Other Punctuation 313382
 
2.2%
Decimal Number 42329
 
0.3%
Dash Punctuation 36848
 
0.3%
Close Punctuation 10112
 
0.1%
Open Punctuation 10090
 
0.1%
Final Punctuation 4560
 
< 0.1%
Initial Punctuation 884
 
< 0.1%
Other values (15) 869
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1366174
12.2%
a 942275
 
8.4%
t 936476
 
8.4%
i 853105
 
7.6%
o 831419
 
7.4%
n 824147
 
7.4%
s 769185
 
6.9%
r 745638
 
6.7%
h 601821
 
5.4%
l 479700
 
4.3%
Other values (142) 2820266
25.2%
Uppercase Letter
ValueCountFrequency (%)
A 42831
 
10.9%
T 36041
 
9.2%
S 31203
 
8.0%
M 24000
 
6.1%
B 23750
 
6.1%
C 22837
 
5.8%
H 19463
 
5.0%
W 18685
 
4.8%
I 16837
 
4.3%
D 16347
 
4.2%
Other values (77) 139754
35.7%
Other Letter
ValueCountFrequency (%)
6
 
4.8%
6
 
4.8%
5
 
4.0%
4
 
3.2%
3
 
2.4%
3
 
2.4%
3
 
2.4%
3
 
2.4%
2
 
1.6%
2
 
1.6%
Other values (76) 88
70.4%
Other Punctuation
ValueCountFrequency (%)
, 133694
42.7%
. 124991
39.9%
' 31173
 
9.9%
" 11693
 
3.7%
: 3306
 
1.1%
? 2765
 
0.9%
; 2496
 
0.8%
! 1546
 
0.5%
/ 769
 
0.2%
& 455
 
0.1%
Other values (12) 494
 
0.2%
Nonspacing Mark
ValueCountFrequency (%)
ి 4
12.1%
́ 4
12.1%
̈ 3
9.1%
3
9.1%
3
9.1%
3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
Other values (4) 5
15.2%
Decimal Number
ValueCountFrequency (%)
1 9770
23.1%
0 8292
19.6%
9 6422
15.2%
2 4265
10.1%
5 2446
 
5.8%
8 2384
 
5.6%
3 2346
 
5.5%
4 2181
 
5.2%
7 2135
 
5.0%
6 2088
 
4.9%
Spacing Mark
ValueCountFrequency (%)
11
40.7%
4
 
14.8%
3
 
11.1%
3
 
11.1%
ि 2
 
7.4%
2
 
7.4%
1
 
3.7%
ி 1
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
- 35321
95.9%
885
 
2.4%
633
 
1.7%
5
 
< 0.1%
4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
® 45
70.3%
14
 
21.9%
° 2
 
3.1%
¦ 2
 
3.1%
1
 
1.6%
Math Symbol
ValueCountFrequency (%)
~ 20
46.5%
+ 12
27.9%
= 6
 
14.0%
| 4
 
9.3%
1
 
2.3%
Open Punctuation
ValueCountFrequency (%)
( 10036
99.5%
[ 51
 
0.5%
{ 2
 
< 0.1%
1
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
$ 318
96.4%
£ 10
 
3.0%
1
 
0.3%
1
 
0.3%
Space Separator
ValueCountFrequency (%)
2410599
> 99.9%
  36
 
< 0.1%
  2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 10060
99.5%
] 50
 
0.5%
} 2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
3850
84.4%
691
 
15.2%
» 19
 
0.4%
Initial Punctuation
ValueCountFrequency (%)
673
76.1%
193
 
21.8%
« 18
 
2.0%
Control
ValueCountFrequency (%)
106
96.4%
’ 3
 
2.7%
 1
 
0.9%
Modifier Symbol
ValueCountFrequency (%)
´ 25
65.8%
` 12
31.6%
¯ 1
 
2.6%
Format
ValueCountFrequency (%)
31
60.8%
­ 20
39.2%
Other Number
ValueCountFrequency (%)
½ 8
50.0%
¹ 8
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%
Line Separator
ValueCountFrequency (%)
7
100.0%
Paragraph Separator
ValueCountFrequency (%)
2
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 2
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11556722
80.3%
Common 2829524
 
19.7%
Cyrillic 4587
 
< 0.1%
Greek 648
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Han 10
 
< 0.1%
Hangul 9
 
< 0.1%
Other values (3) 19
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1366174
11.8%
a 942275
 
8.2%
t 936476
 
8.1%
i 853105
 
7.4%
o 831419
 
7.2%
n 824147
 
7.1%
s 769185
 
6.7%
r 745638
 
6.5%
h 601821
 
5.2%
l 479700
 
4.2%
Other values (132) 3206782
27.7%
Common
ValueCountFrequency (%)
2410599
85.2%
, 133694
 
4.7%
. 124991
 
4.4%
- 35321
 
1.2%
' 31173
 
1.1%
" 11693
 
0.4%
) 10060
 
0.4%
( 10036
 
0.4%
1 9770
 
0.3%
0 8292
 
0.3%
Other values (71) 43895
 
1.6%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Greek
ValueCountFrequency (%)
α 60
 
9.3%
ο 55
 
8.5%
τ 43
 
6.6%
η 36
 
5.6%
ι 36
 
5.6%
ν 34
 
5.2%
ε 31
 
4.8%
ρ 31
 
4.8%
π 30
 
4.6%
ς 30
 
4.6%
Other values (33) 262
40.4%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Inherited
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14373653
99.9%
Punctuation 7281
 
0.1%
None 5933
 
< 0.1%
Cyrillic 4587
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Letterlike Symbols 14
 
< 0.1%
CJK 10
 
< 0.1%
Other values (11) 41
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2410599
16.8%
e 1366174
 
9.5%
a 942275
 
6.6%
t 936476
 
6.5%
i 853105
 
5.9%
o 831419
 
5.8%
n 824147
 
5.7%
s 769185
 
5.4%
r 745638
 
5.2%
h 601821
 
4.2%
Other values (82) 4092814
28.5%
Punctuation
ValueCountFrequency (%)
3850
52.9%
885
 
12.2%
691
 
9.5%
673
 
9.2%
633
 
8.7%
304
 
4.2%
193
 
2.7%
31
 
0.4%
7
 
0.1%
5
 
0.1%
Other values (4) 9
 
0.1%
None
ValueCountFrequency (%)
é 1552
26.2%
ä 294
 
5.0%
á 293
 
4.9%
ö 250
 
4.2%
í 244
 
4.1%
è 209
 
3.5%
ü 178
 
3.0%
ı 165
 
2.8%
ó 164
 
2.8%
ç 158
 
2.7%
Other values (141) 2426
40.9%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Letterlike Symbols
ValueCountFrequency (%)
14
100.0%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Diacriticals
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%
Alphabetic PF
ValueCountFrequency (%)
4
100.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Modifier Letters
ValueCountFrequency (%)
ʼ 2
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Math Operators
ValueCountFrequency (%)
1
100.0%
Katakana
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Specials
ValueCountFrequency (%)
1
100.0%

popularity
Real number (ℝ)

Distinct: 43757
Distinct (%): 96.3%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9214783
Minimum: 0
Maximum: 547.4883
Zeros: 66
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.018921
Q1: 0.38594775
median: 1.127685
Q3: 3.6789023
95-th percentile: 11.061568
Maximum: 547.4883
Range: 547.4883
Interquartile range (IQR): 3.2929545

Descriptive statistics

Standard deviation: 6.0054143
Coefficient of variation (CV): 2.055608
Kurtosis: 1925.684
Mean: 2.9214783
Median Absolute Deviation (MAD): 0.9672565
Skewness: 29.225454
Sum: 132810.41
Variance: 36.065001
Monotonicity: Not monotonic
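The skewness figure is what triggers the "highly skewed" alert for popularity. A minimal sketch of the bias-corrected sample skewness (the adjusted Fisher-Pearson coefficient, which is what pandas computes) follows. That the report relies on this exact estimator is an assumption, and `sample_skewness` is a hypothetical helper that needs at least three non-identical values.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (G1), the bias-corrected estimator."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


print(sample_skewness([1, 2, 3, 4, 5]))   # 0.0 for symmetric data
print(sample_skewness([1, 2, 3, 4, 10]))  # positive: long right tail
```

A large positive value, as reported here, indicates a few very large observations far above the bulk of the data, which matches the maximum of 547.4883 against a median near 1.13.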
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 66 | 0.1%
1 × 10⁻⁶ | 56 | 0.1%
0.000308 | 43 | 0.1%
0.00022 | 40 | 0.1%
0.000844 | 38 | 0.1%
0.001177 | 38 | 0.1%
0.000578 | 38 | 0.1%
0.002001 | 28 | 0.1%
0.003013 | 21 | < 0.1%
0.001393 | 19 | < 0.1%
Other values (43747) | 45073 | 99.1%
Value | Count | Frequency (%)
0 | 66 | 0.1%
1 × 10⁻⁶ | 56 | 0.1%
2 × 10⁻⁶ | 6 | < 0.1%
3 × 10⁻⁶ | 6 | < 0.1%
4 × 10⁻⁶ | 5 | < 0.1%
5 × 10⁻⁶ | 1 | < 0.1%
6 × 10⁻⁶ | 4 | < 0.1%
7 × 10⁻⁶ | 1 | < 0.1%
8 × 10⁻⁶ | 6 | < 0.1%
9 × 10⁻⁶ | 2 | < 0.1%
ValueCountFrequency (%)
547.488298 1
< 0.1%
294.337037 1
< 0.1%
287.253654 1
< 0.1%
228.032744 1
< 0.1%
213.849907 1
< 0.1%
187.860492 1
< 0.1%
185.330992 1
< 0.1%
185.070892 1
< 0.1%
183.870374 1
< 0.1%
154.801009 1
< 0.1%

production_companies
Categorical

HIGH CARDINALITY 

Distinct: 22671
Distinct (%): 49.9%
Missing: 0
Missing (%): 0.0%
Memory size: 355.3 KiB
unknown
11878 
Metro-Goldwyn-Mayer (MGM)
 
742
Warner Bros.
 
540
Paramount Pictures
 
505
Twentieth Century Fox Film Corporation
 
439
Other values (22666)
31359 

Length

Max length: 609
Median length: 476
Mean length: 32.483558
Min length: 2

Characters and Unicode

Total characters: 1476800
Distinct characters: 294
Distinct categories: 17
Distinct scripts: 6
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20303
Unique (%): 44.7%

Sample

1st row: Pixar Animation Studios
2nd row: TriStar Pictures, Teitler Film, Interscope Communications
3rd row: Warner Bros., Lancaster Gate
4th row: Twentieth Century Fox Film Corporation
5th row: Sandollar Productions, Touchstone Pictures

Common Values

Value | Count | Frequency (%)
unknown | 11878 | 26.1%
Metro-Goldwyn-Mayer (MGM) | 742 | 1.6%
Warner Bros. | 540 | 1.2%
Paramount Pictures | 505 | 1.1%
Twentieth Century Fox Film Corporation | 439 | 1.0%
Universal Pictures | 320 | 0.7%
RKO Radio Pictures | 247 | 0.5%
Columbia Pictures Corporation | 207 | 0.5%
Columbia Pictures | 146 | 0.3%
Mosfilm | 145 | 0.3%
Other values (22661) | 30294 | 66.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
unknown 11878
 
6.3%
films 9457
 
5.0%
pictures 9267
 
4.9%
productions 9061
 
4.8%
film 6679
 
3.5%
entertainment 5156
 
2.7%
corporation 2190
 
1.2%
company 1769
 
0.9%
warner 1478
 
0.8%
bros 1411
 
0.7%
Other values (18620) 131247
69.2%

Most occurring characters

ValueCountFrequency (%)
144139
 
9.8%
n 125624
 
8.5%
i 106960
 
7.2%
o 97188
 
6.6%
e 94664
 
6.4%
r 83560
 
5.7%
t 83455
 
5.7%
a 77156
 
5.2%
s 62681
 
4.2%
u 55632
 
3.8%
Other values (284) 545741
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1070357
72.5%
Uppercase Letter 199004
 
13.5%
Space Separator 144144
 
9.8%
Other Punctuation 45110
 
3.1%
Decimal Number 4348
 
0.3%
Dash Punctuation 4331
 
0.3%
Open Punctuation 4330
 
0.3%
Close Punctuation 4329
 
0.3%
Math Symbol 662
 
< 0.1%
Other Letter 140
 
< 0.1%
Other values (7) 45
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 125624
11.7%
i 106960
10.0%
o 97188
9.1%
e 94664
8.8%
r 83560
 
7.8%
t 83455
 
7.8%
a 77156
 
7.2%
s 62681
 
5.9%
u 55632
 
5.2%
l 51269
 
4.8%
Other values (102) 232168
21.7%
Other Letter
ValueCountFrequency (%)
9
 
6.4%
8
 
5.7%
6
 
4.3%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
4
 
2.9%
3
 
2.1%
Other values (62) 85
60.7%
Uppercase Letter
ValueCountFrequency (%)
P 27882
14.0%
F 26364
13.2%
C 20589
 
10.3%
M 13363
 
6.7%
S 11914
 
6.0%
E 9750
 
4.9%
A 9550
 
4.8%
T 9357
 
4.7%
B 9006
 
4.5%
G 7812
 
3.9%
Other values (52) 53417
26.8%
Other Punctuation
ValueCountFrequency (%)
, 37364
82.8%
. 5671
 
12.6%
& 765
 
1.7%
/ 645
 
1.4%
' 451
 
1.0%
" 133
 
0.3%
! 36
 
0.1%
% 18
 
< 0.1%
: 9
 
< 0.1%
@ 5
 
< 0.1%
Other values (6) 13
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 1035
23.8%
1 712
16.4%
0 641
14.7%
3 556
12.8%
4 481
11.1%
9 205
 
4.7%
6 195
 
4.5%
5 178
 
4.1%
8 173
 
4.0%
7 172
 
4.0%
Open Punctuation
ValueCountFrequency (%)
( 4320
99.8%
[ 9
 
0.2%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 4319
99.8%
] 9
 
0.2%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
144139
> 99.9%
  5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4329
> 99.9%
2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 661
99.8%
| 1
 
0.2%
Other Symbol
ValueCountFrequency (%)
° 23
92.0%
2
 
8.0%
Final Punctuation
ValueCountFrequency (%)
» 3
50.0%
3
50.0%
Other Number
ValueCountFrequency (%)
² 1
50.0%
½ 1
50.0%
Control
ValueCountFrequency (%)
4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 3
100.0%
Format
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1268958
85.9%
Common 207297
 
14.0%
Cyrillic 373
 
< 0.1%
Hangul 115
 
< 0.1%
Greek 31
 
< 0.1%
Han 26
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 125624
 
9.9%
i 106960
 
8.4%
o 97188
 
7.7%
e 94664
 
7.5%
r 83560
 
6.6%
t 83455
 
6.6%
a 77156
 
6.1%
s 62681
 
4.9%
u 55632
 
4.4%
l 51269
 
4.0%
Other values (99) 430769
33.9%
Hangul
ValueCountFrequency (%)
9
 
7.8%
8
 
7.0%
6
 
5.2%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
4
 
3.5%
3
 
2.6%
Other values (43) 60
52.2%
Common
ValueCountFrequency (%)
144139
69.5%
, 37364
 
18.0%
. 5671
 
2.7%
- 4329
 
2.1%
( 4320
 
2.1%
) 4319
 
2.1%
2 1035
 
0.5%
& 765
 
0.4%
1 712
 
0.3%
+ 661
 
0.3%
Other values (37) 3982
 
1.9%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
ь 16
 
4.3%
с 16
 
4.3%
е 16
 
4.3%
Other values (36) 159
42.6%
Greek
ValueCountFrequency (%)
ο 3
 
9.7%
ν 3
 
9.7%
τ 2
 
6.5%
ρ 2
 
6.5%
Κ 2
 
6.5%
ι 2
 
6.5%
η 2
 
6.5%
λ 2
 
6.5%
Ε 2
 
6.5%
κ 1
 
3.2%
Other values (10) 10
32.3%
Han
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1470570
99.6%
None 5711
 
0.4%
Cyrillic 373
 
< 0.1%
Hangul 113
 
< 0.1%
CJK 26
 
< 0.1%
Punctuation 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
144139
 
9.8%
n 125624
 
8.5%
i 106960
 
7.3%
o 97188
 
6.6%
e 94664
 
6.4%
r 83560
 
5.7%
t 83455
 
5.7%
a 77156
 
5.2%
s 62681
 
4.3%
u 55632
 
3.8%
Other values (77) 539511
36.7%
None
ValueCountFrequency (%)
é 3176
55.6%
ó 416
 
7.3%
á 317
 
5.6%
í 173
 
3.0%
ü 154
 
2.7%
ñ 150
 
2.6%
ô 140
 
2.5%
ä 137
 
2.4%
è 136
 
2.4%
ö 132
 
2.3%
Other values (76) 780
 
13.7%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
ь 16
 
4.3%
с 16
 
4.3%
е 16
 
4.3%
Other values (36) 159
42.6%
Hangul
ValueCountFrequency (%)
9
 
8.0%
8
 
7.1%
6
 
5.3%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
4
 
3.5%
3
 
2.7%
Other values (42) 58
51.3%
Punctuation
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
CJK
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

production_countries
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 2390
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 355.3 KiB

United States of America   17851
unknown                     6285
United Kingdom              2238
France                      1654
Japan                       1356
Other values (2385)        16079

Length

Max length: 237
Median length: 167
Mean length: 17.380815
Min length: 4

Characters and Unicode

Total characters: 790184
Distinct characters: 53
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1765
Unique (%): 3.9%

Sample

1st row: United States of America
2nd row: United States of America
3rd row: United States of America
4th row: United States of America
5th row: United States of America

Common Values

Value                      Count   Frequency (%)
United States of America   17851   39.3%
unknown                     6285   13.8%
United Kingdom              2238   4.9%
France                      1654   3.6%
Japan                       1356   3.0%
Italy                       1030   2.3%
Canada                       840   1.8%
Germany                      749   1.6%
India                        735   1.6%
Russia                       735   1.6%
Other values (2380)        11990   26.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united 25275
20.2%
states 21154
16.9%
of 21153
16.9%
america 21153
16.9%
unknown 6285
 
5.0%
kingdom 4094
 
3.3%
france 3940
 
3.2%
germany 2260
 
1.8%
italy 2169
 
1.7%
canada 1765
 
1.4%
Other values (178) 15827
12.7%

Most occurring characters

ValueCountFrequency (%)
e 80672
 
10.2%
79612
 
10.1%
t 72641
 
9.2%
a 70506
 
8.9%
n 66365
 
8.4%
i 58568
 
7.4%
o 35874
 
4.5%
d 34558
 
4.4%
r 32498
 
4.1%
m 28713
 
3.6%
Other values (43) 230177
29.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 602723
76.3%
Uppercase Letter 97599
 
12.4%
Space Separator 79612
 
10.1%
Other Punctuation 10250
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 80672
13.4%
t 72641
12.1%
a 70506
11.7%
n 66365
11.0%
i 58568
9.7%
o 35874
 
6.0%
d 34558
 
5.7%
r 32498
 
5.4%
m 28713
 
4.8%
c 26378
 
4.4%
Other values (16) 95950
15.9%
Uppercase Letter
ValueCountFrequency (%)
U 25376
26.0%
S 23842
24.4%
A 22395
22.9%
K 5221
 
5.3%
F 4335
 
4.4%
I 3588
 
3.7%
C 2594
 
2.7%
G 2473
 
2.5%
J 1664
 
1.7%
R 1308
 
1.3%
Other values (14) 4803
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 10245
> 99.9%
' 5
 
< 0.1%
Space Separator
ValueCountFrequency (%)
79612
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 700322
88.6%
Common 89862
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 80672
11.5%
t 72641
 
10.4%
a 70506
 
10.1%
n 66365
 
9.5%
i 58568
 
8.4%
o 35874
 
5.1%
d 34558
 
4.9%
r 32498
 
4.6%
m 28713
 
4.1%
c 26378
 
3.8%
Other values (40) 193549
27.6%
Common
ValueCountFrequency (%)
79612
88.6%
, 10245
 
11.4%
' 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 790184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 80672
 
10.2%
79612
 
10.1%
t 72641
 
9.2%
a 70506
 
8.9%
n 66365
 
8.4%
i 58568
 
7.4%
o 35874
 
4.5%
d 34558
 
4.4%
r 32498
 
4.1%
m 28713
 
3.6%
Other values (43) 230177
29.1%

release_date
Categorical

Distinct: 17333
Distinct (%): 38.2%
Missing: 87
Missing (%): 0.2%
Memory size: 355.3 KiB

2008-01-01               136
2009-01-01               121
2007-01-01               118
2005-01-01               111
2006-01-01               101
Other values (17328)   44789

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 453760
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8570
Unique (%): 18.9%

Sample

1st row: 1995-10-30
2nd row: 1995-12-15
3rd row: 1995-12-22
4th row: 1995-12-22
5th row: 1995-02-10
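
The variable is profiled as categorical text, yet every value has length 10 and the sample rows look like ISO 8601 dates. A minimal sketch, assuming the `YYYY-MM-DD` layout holds for all non-missing values, parses them with the standard library. The variable names are illustrative only.

```python
from datetime import date

# Sample rows copied from the report; the ISO layout is an assumption.
samples = ["1995-10-30", "1995-12-15", "1995-12-22", "1995-12-22", "1995-02-10"]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD.
parsed = [date.fromisoformat(s) for s in samples]
years = sorted({d.year for d in parsed})
earliest = min(parsed)
print(years, earliest.isoformat())
```

Once parsed, the values can be treated as dates rather than as high-cardinality categories.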

Common Values

Value                  Count   Frequency (%)
2008-01-01               136   0.3%
2009-01-01               121   0.3%
2007-01-01               118   0.3%
2005-01-01               111   0.2%
2006-01-01               101   0.2%
2002-01-01                96   0.2%
2004-01-01                90   0.2%
2001-01-01                84   0.2%
2003-01-01                76   0.2%
1997-01-01                69   0.2%
Other values (17323)   44374   97.6%
(Missing)                 87   0.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2008-01-01 136
 
0.3%
2009-01-01 121
 
0.3%
2007-01-01 118
 
0.3%
2005-01-01 111
 
0.2%
2006-01-01 101
 
0.2%
2002-01-01 96
 
0.2%
2004-01-01 90
 
0.2%
2001-01-01 84
 
0.2%
2003-01-01 76
 
0.2%
1997-01-01 69
 
0.2%
Other values (17323) 44374
97.8%

Most occurring characters

ValueCountFrequency (%)
0 97600
21.5%
- 90752
20.0%
1 84054
18.5%
2 52803
11.6%
9 39773
8.8%
3 15435
 
3.4%
8 15279
 
3.4%
6 15021
 
3.3%
5 14836
 
3.3%
7 14289
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 363008
80.0%
Dash Punctuation 90752
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 97600
26.9%
1 84054
23.2%
2 52803
14.5%
9 39773
11.0%
3 15435
 
4.3%
8 15279
 
4.2%
6 15021
 
4.1%
5 14836
 
4.1%
7 14289
 
3.9%
4 13918
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 90752
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 453760
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 97600
21.5%
- 90752
20.0%
1 84054
18.5%
2 52803
11.6%
9 39773
8.8%
3 15435
 
3.4%
8 15279
 
3.4%
6 15021
 
3.3%
5 14836
 
3.3%
7 14289
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 453760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 97600
21.5%
- 90752
20.0%
1 84054
18.5%
2 52803
11.6%
9 39773
8.8%
3 15435
 
3.4%
8 15279
 
3.4%
6 15021
 
3.3%
5 14836
 
3.3%
7 14289
 
3.1%

revenue
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6863
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11208609
Minimum: 0
Maximum: 2.7879651 × 10^9
Zeros: 38055
Zeros (%): 83.7%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 47798669
Maximum: 2.7879651 × 10^9
Range: 2.7879651 × 10^9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 64330189
Coefficient of variation (CV): 5.7393553
Kurtosis: 237.52606
Mean: 11208609
Median Absolute Deviation (MAD): 0
Skewness: 12.266385
Sum: 5.0957698 × 10^11
Variance: 4.1383732 × 10^15
Monotonicity: Not monotonic
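
Several of the figures above are tied together by simple identities, which gives a quick sanity check on the table. The sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared) and the share of zeros from the rounded values printed in this section. The row count of 45463 is taken from the report overview, and the variable names are illustrative.

```python
import math

# Rounded values as printed in the report.
mean = 11208609
std = 64330189
zeros = 38055
n_rows = 45463  # number of observations, from the report overview

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
zero_share = 100 * zeros / n_rows  # percentage of rows with revenue 0

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"zeros = {zero_share:.1f}%")
```

Because the inputs are rounded, the results should agree with the reported values only up to the last printed digits.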
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
0                     38055   83.7%
12000000                 20   < 0.1%
11000000                 19   < 0.1%
10000000                 19   < 0.1%
2000000                  18   < 0.1%
6000000                  17   < 0.1%
5000000                  14   < 0.1%
500000                   13   < 0.1%
8000000                  13   < 0.1%
1                        12   < 0.1%
Other values (6853)    7263   16.0%

Minimum 10 values

Value   Count   Frequency (%)
0       38055   83.7%
1          12   < 0.1%
2           3   < 0.1%
3           9   < 0.1%
4           4   < 0.1%
5           5   < 0.1%
6           2   < 0.1%
7           4   < 0.1%
8           5   < 0.1%
9           1   < 0.1%

Maximum 10 values

Value        Count   Frequency (%)
2787965087       1   < 0.1%
2068223624       1   < 0.1%
1845034188       1   < 0.1%
1519557910       1   < 0.1%
1513528810       1   < 0.1%
1506249360       1   < 0.1%
1405403694       1   < 0.1%
1342000000       1   < 0.1%
1274219009       1   < 0.1%
1262886337       1   < 0.1%

runtime
Real number (ℝ)

Distinct: 353
Distinct (%): 0.8%
Missing: 260
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.128199
Minimum: 0
Maximum: 1256
Zeros: 1558
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 11
Q1: 85
median: 95
Q3: 107
95-th percentile: 138
Maximum: 1256
Range: 1256
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 38.40781
Coefficient of variation (CV): 0.40803724
Kurtosis: 93.217158
Mean: 94.128199
Median Absolute Deviation (MAD): 11
Skewness: 4.4659579
Sum: 4254877
Variance: 1475.1599
Monotonicity: Not monotonic
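
The runtime figures can be cross-checked in a similar way, and the check also recovers how many rows contributed to the statistics. This is a sketch under the assumption that missing values are excluded from the mean and the sum: the interquartile range is Q3 minus Q1, and the sum divided by the mean should equal the number of non-missing rows. The row count of 45463 comes from the report overview.

```python
# Values as printed in this section of the report.
q1, q3 = 85, 107
total, mean = 4254877, 94.128199
n_rows, missing = 45463, 260  # n_rows is taken from the report overview

iqr = q3 - q1                      # interquartile range
non_missing = round(total / mean)  # rows implied by sum / mean
print(iqr, non_missing, n_rows - missing)
```

If the assumption holds, the last two printed numbers coincide.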
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
90                    2556   5.6%
0                     1558   3.4%
100                   1470   3.2%
95                    1412   3.1%
93                    1214   2.7%
96                    1104   2.4%
92                    1080   2.4%
94                    1062   2.3%
91                    1057   2.3%
88                    1032   2.3%
Other values (343)   31658   69.6%

Minimum 10 values

Value   Count   Frequency (%)
0        1558   3.4%
1         107   0.2%
2          33   0.1%
3          48   0.1%
4          51   0.1%
5          51   0.1%
6          72   0.2%
7         103   0.2%
8          78   0.2%
9          63   0.1%

Maximum 10 values

Value   Count   Frequency (%)
1256        1   < 0.1%
1140        2   < 0.1%
931         1   < 0.1%
925         1   < 0.1%
900         1   < 0.1%
877         1   < 0.1%
874         1   < 0.1%
840         2   < 0.1%
780         1   < 0.1%
720         1   < 0.1%

spoken_languages
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct: 1841
Distinct (%): 4.4%
Missing: 3955
Missing (%): 8.7%
Memory size: 355.3 KiB

English               22395
Français               1853
日本語                  1289
Italiano               1218
Español                 902
Other values (1836)   13851

Length

Max length: 171
Median length: 7
Mean length: 9.3972728
Min length: 2

Characters and Unicode

Total characters: 390062
Distinct characters: 171
Distinct categories: 8
Distinct scripts: 15
Distinct blocks: 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1293
Unique (%): 3.1%

Sample

1st row: English
2nd row: English, Français
3rd row: English
4th row: English
5th row: English
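
The sample rows suggest that a film with several languages is stored as one comma-separated string, which would explain why combinations such as `English, Français` count as distinct values. A minimal sketch, assuming a comma is the separator throughout, splits each entry and counts individual languages. The function name `count_languages` is hypothetical.

```python
from collections import Counter

def count_languages(values):
    """Split comma-separated entries and count every language mention."""
    counts = Counter()
    for value in values:
        counts.update(part.strip() for part in value.split(",") if part.strip())
    return counts

# Sample rows copied from the report.
rows = ["English", "English, Français", "English", "English", "English"]
language_counts = count_languages(rows)
print(language_counts.most_common())
```

Counting per language rather than per combination would reduce the cardinality considerably, at the cost of losing which languages occur together.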

Common Values

Value                 Count   Frequency (%)
English               22395   49.3%
Français               1853   4.1%
日本語                  1289   2.8%
Italiano               1218   2.7%
Español                 902   2.0%
Pусский                 807   1.8%
Deutsch                 762   1.7%
English, Français       681   1.5%
English, Español        572   1.3%
हिन्दी                   481   1.1%
Other values (1831)   10548   23.2%
(Missing)              3955   8.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
english 28745
52.8%
français 4196
 
7.7%
deutsch 2625
 
4.8%
español 2413
 
4.4%
italiano 2367
 
4.4%
日本語 1758
 
3.2%
pусский 1563
 
2.9%
普通话 790
 
1.5%
हिन्दी 707
 
1.3%
663
 
1.2%
Other values (69) 8569
 
15.8%

Most occurring characters

ValueCountFrequency (%)
s 42291
10.8%
n 37482
 
9.6%
i 37129
 
9.5%
l 34650
 
8.9%
h 31476
 
8.1%
E 31215
 
8.0%
g 30430
 
7.8%
a 18957
 
4.9%
13082
 
3.4%
, 11669
 
3.0%
Other values (161) 101681
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 292184
74.9%
Uppercase Letter 46453
 
11.9%
Other Letter 22196
 
5.7%
Space Separator 13082
 
3.4%
Other Punctuation 12734
 
3.3%
Spacing Mark 1838
 
0.5%
Nonspacing Mark 1549
 
0.4%
Control 26
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 42291
14.5%
n 37482
12.8%
i 37129
12.7%
l 34650
11.9%
h 31476
10.8%
g 30430
10.4%
a 18957
6.5%
o 7055
 
2.4%
r 6132
 
2.1%
t 5980
 
2.0%
Other values (63) 40602
13.9%
Other Letter
ValueCountFrequency (%)
1758
 
7.9%
1758
 
7.9%
1758
 
7.9%
1263
 
5.7%
946
 
4.3%
790
 
3.6%
790
 
3.6%
707
 
3.2%
707
 
3.2%
707
 
3.2%
Other values (46) 11012
49.6%
Uppercase Letter
ValueCountFrequency (%)
E 31215
67.2%
F 4198
 
9.0%
D 2927
 
6.3%
P 2678
 
5.8%
I 2367
 
5.1%
N 830
 
1.8%
L 506
 
1.1%
M 363
 
0.8%
T 308
 
0.7%
Č 284
 
0.6%
Other values (13) 777
 
1.7%
Spacing Mark
ValueCountFrequency (%)
707
38.5%
ि 707
38.5%
136
 
7.4%
ி 111
 
6.0%
94
 
5.1%
47
 
2.6%
18
 
1.0%
18
 
1.0%
Nonspacing Mark
ValueCountFrequency (%)
707
45.6%
ִ 430
27.8%
ְ 215
 
13.9%
111
 
7.2%
68
 
4.4%
18
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 11669
91.6%
/ 1015
 
8.0%
? 50
 
0.4%
Space Separator
ValueCountFrequency (%)
13082
100.0%
Control
ValueCountFrequency (%)
š 26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 326242
83.6%
Common 25842
 
6.6%
Han 10482
 
2.7%
Cyrillic 10460
 
2.7%
Devanagari 4242
 
1.1%
Arabic 3349
 
0.9%
Hangul 3252
 
0.8%
Hebrew 1720
 
0.4%
Greek 1704
 
0.4%
Thai 1232
 
0.3%
Other values (5) 1537
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 42291
13.0%
n 37482
11.5%
i 37129
11.4%
l 34650
10.6%
h 31476
9.6%
E 31215
9.6%
g 30430
9.3%
a 18957
 
5.8%
o 7055
 
2.2%
r 6132
 
1.9%
Other values (50) 49425
15.1%
Cyrillic
ValueCountFrequency (%)
с 3213
30.7%
к 1735
16.6%
и 1680
16.1%
й 1616
15.4%
у 1565
15.0%
а 113
 
1.1%
р 87
 
0.8%
У 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 292
 
2.8%
Arabic
ValueCountFrequency (%)
ا 538
16.1%
ر 538
16.1%
ة 341
10.2%
ي 341
10.2%
ع 341
10.2%
ل 341
10.2%
ب 341
10.2%
ی 142
 
4.2%
ف 142
 
4.2%
س 142
 
4.2%
Other values (5) 142
 
4.2%
Han
ValueCountFrequency (%)
1758
16.8%
1758
16.8%
1758
16.8%
1263
12.0%
946
9.0%
790
7.5%
790
7.5%
473
 
4.5%
广 473
 
4.5%
473
 
4.5%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ְ 215
12.5%
ב 215
12.5%
ע 215
12.5%
Greek
ValueCountFrequency (%)
λ 426
25.0%
ά 213
12.5%
κ 213
12.5%
ι 213
12.5%
ν 213
12.5%
η 213
12.5%
ε 213
12.5%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Devanagari
ValueCountFrequency (%)
707
16.7%
707
16.7%
707
16.7%
707
16.7%
707
16.7%
ि 707
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Thai
ValueCountFrequency (%)
352
28.6%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
Common
ValueCountFrequency (%)
13082
50.6%
, 11669
45.2%
/ 1015
 
3.9%
? 50
 
0.2%
š 26
 
0.1%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
ி 111
20.0%
111
20.0%
111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 343224
88.0%
CJK 10482
 
2.7%
Cyrillic 10460
 
2.7%
None 10438
 
2.7%
Devanagari 4242
 
1.1%
Arabic 3349
 
0.9%
Hangul 3252
 
0.8%
Hebrew 1720
 
0.4%
Thai 1232
 
0.3%
Tamil 555
 
0.1%
Other values (6) 1108
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 42291
12.3%
n 37482
10.9%
i 37129
10.8%
l 34650
10.1%
h 31476
9.2%
E 31215
9.1%
g 30430
8.9%
a 18957
 
5.5%
13082
 
3.8%
, 11669
 
3.4%
Other values (38) 54843
16.0%
None
ValueCountFrequency (%)
ç 4443
42.6%
ñ 2413
23.1%
ê 591
 
5.7%
λ 426
 
4.1%
ý 284
 
2.7%
Č 284
 
2.7%
ü 247
 
2.4%
ά 213
 
2.0%
κ 213
 
2.0%
ι 213
 
2.0%
Other values (11) 1111
 
10.6%
Cyrillic
ValueCountFrequency (%)
с 3213
30.7%
к 1735
16.6%
и 1680
16.1%
й 1616
15.4%
у 1565
15.0%
а 113
 
1.1%
р 87
 
0.8%
У 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 292
 
2.8%
CJK
ValueCountFrequency (%)
1758
16.8%
1758
16.8%
1758
16.8%
1263
12.0%
946
9.0%
790
7.5%
790
7.5%
473
 
4.5%
广 473
 
4.5%
473
 
4.5%
Devanagari
ValueCountFrequency (%)
707
16.7%
707
16.7%
707
16.7%
707
16.7%
707
16.7%
ि 707
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Arabic
ValueCountFrequency (%)
ا 538
16.1%
ر 538
16.1%
ة 341
10.2%
ي 341
10.2%
ع 341
10.2%
ل 341
10.2%
ب 341
10.2%
ی 142
 
4.2%
ف 142
 
4.2%
س 142
 
4.2%
Other values (5) 142
 
4.2%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ְ 215
12.5%
ב 215
12.5%
ע 215
12.5%
Thai
ValueCountFrequency (%)
352
28.6%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
ி 111
20.0%
111
20.0%
111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%
Latin Ext Additional
ValueCountFrequency (%)
ế 61
50.0%
61
50.0%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
IPA Ext
ValueCountFrequency (%)
ə 4
100.0%

status
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 84
Missing (%): 0.2%
Memory size: 355.3 KiB

Released          45014
Rumored             230
Post Production      98
In Production        20
Planned              15

Length

Max length: 15
Median length: 8
Mean length: 8.0119218
Min length: 7

Characters and Unicode

Total characters: 363573
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Released
2nd row: Released
3rd row: Released
4th row: Released
5th row: Released

Common Values

Value             Count   Frequency (%)
Released          45014   99.0%
Rumored             230   0.5%
Post Production      98   0.2%
In Production        20   < 0.1%
Planned              15   < 0.1%
Canceled              2   < 0.1%
(Missing)            84   0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
released 45014
98.9%
rumored 230
 
0.5%
production 118
 
0.3%
post 98
 
0.2%
in 20
 
< 0.1%
planned 15
 
< 0.1%
canceled 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 135291
37.2%
d 45379
 
12.5%
R 45244
 
12.4%
s 45112
 
12.4%
l 45031
 
12.4%
a 45031
 
12.4%
o 564
 
0.2%
r 348
 
0.1%
u 348
 
0.1%
P 231
 
0.1%
Other values (8) 994
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 317958
87.5%
Uppercase Letter 45497
 
12.5%
Space Separator 118
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 135291
42.5%
d 45379
 
14.3%
s 45112
 
14.2%
l 45031
 
14.2%
a 45031
 
14.2%
o 564
 
0.2%
r 348
 
0.1%
u 348
 
0.1%
m 230
 
0.1%
t 216
 
0.1%
Other values (3) 408
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
R 45244
99.4%
P 231
 
0.5%
I 20
 
< 0.1%
C 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
118
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 363455
> 99.9%
Common 118
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 135291
37.2%
d 45379
 
12.5%
R 45244
 
12.4%
s 45112
 
12.4%
l 45031
 
12.4%
a 45031
 
12.4%
o 564
 
0.2%
r 348
 
0.1%
u 348
 
0.1%
P 231
 
0.1%
Other values (7) 876
 
0.2%
Common
ValueCountFrequency (%)
118
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 363573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 135291
37.2%
d 45379
 
12.5%
R 45244
 
12.4%
s 45112
 
12.4%
l 45031
 
12.4%
a 45031
 
12.4%
o 564
 
0.2%
r 348
 
0.1%
u 348
 
0.1%
P 231
 
0.1%
Other values (8) 994
 
0.3%

tagline
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 20283
Distinct (%): 99.4%
Missing: 25051
Missing (%): 55.1%
Memory size: 355.3 KiB

Based on a true story.              7
Trust no one.                       4
Be careful what you wish for.       4
-                                   4
Classic Albums                      3
Other values (20278)            20390

Length

Max length: 297
Median length: 204
Mean length: 47.002841
Min length: 1

Characters and Unicode

Total characters: 959422
Distinct characters: 170
Distinct categories: 17
Distinct scripts: 6
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20177
Unique (%): 98.8%

Sample

1st row: Roll the dice and unleash the excitement!
2nd row: Still Yelling. Still Fighting. Still Ready for Love.
3rd row: Friends are the people who let you be yourself... and never let you forget it.
4th row: Just When His World Is Back To Normal... He's In For The Surprise Of His Life!
5th row: A Los Angeles Crime Saga

Common Values

Value                                                        Count   Frequency (%)
Based on a true story.                                           7   < 0.1%
Trust no one.                                                    4   < 0.1%
Be careful what you wish for.                                    4   < 0.1%
-                                                                4   < 0.1%
Classic Albums                                                   3   < 0.1%
Some doors should never be opened.                               3   < 0.1%
A Love Story                                                     3   < 0.1%
Drama                                                            3   < 0.1%
Know Your Enemy                                                  3   < 0.1%
Which one is the first to return - memory or the murderer?       3   < 0.1%
Other values (20273)                                         20375   44.8%
(Missing)                                                    25051   55.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 11004
 
6.3%
a 6820
 
3.9%
of 4406
 
2.5%
to 3586
 
2.1%
is 2800
 
1.6%
in 2693
 
1.5%
and 2686
 
1.5%
you 2389
 
1.4%
1585
 
0.9%
for 1524
 
0.9%
Other values (15108) 134566
77.3%

Most occurring characters

ValueCountFrequency (%)
153795
16.0%
e 94486
 
9.8%
t 57309
 
6.0%
o 56611
 
5.9%
a 51521
 
5.4%
n 47539
 
5.0%
i 46086
 
4.8%
r 45029
 
4.7%
s 42399
 
4.4%
h 37192
 
3.9%
Other values (160) 327455
34.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 681040
71.0%
Space Separator 153795
 
16.0%
Uppercase Letter 75028
 
7.8%
Other Punctuation 44604
 
4.6%
Decimal Number 2687
 
0.3%
Dash Punctuation 1948
 
0.2%
Final Punctuation 98
 
< 0.1%
Open Punctuation 56
 
< 0.1%
Close Punctuation 55
 
< 0.1%
Currency Symbol 37
 
< 0.1%
Other values (7) 74
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 94486
13.9%
t 57309
 
8.4%
o 56611
 
8.3%
a 51521
 
7.6%
n 47539
 
7.0%
i 46086
 
6.8%
r 45029
 
6.6%
s 42399
 
6.2%
h 37192
 
5.5%
l 30199
 
4.4%
Other values (43) 172669
25.4%
Other Letter
ValueCountFrequency (%)
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (24) 24
70.6%
Uppercase Letter
ValueCountFrequency (%)
T 10013
 
13.3%
A 6878
 
9.2%
S 5653
 
7.5%
H 4404
 
5.9%
I 4387
 
5.8%
E 4307
 
5.7%
W 3683
 
4.9%
O 3479
 
4.6%
L 3196
 
4.3%
N 3196
 
4.3%
Other values (20) 25832
34.4%
Other Punctuation
ValueCountFrequency (%)
. 26655
59.8%
! 5785
 
13.0%
' 5676
 
12.7%
, 4231
 
9.5%
? 1161
 
2.6%
" 582
 
1.3%
148
 
0.3%
: 138
 
0.3%
& 84
 
0.2%
* 42
 
0.1%
Other values (7) 102
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 802
29.8%
1 516
19.2%
2 299
 
11.1%
3 208
 
7.7%
9 208
 
7.7%
5 168
 
6.3%
4 140
 
5.2%
7 121
 
4.5%
6 121
 
4.5%
8 104
 
3.9%
Math Symbol
ValueCountFrequency (%)
= 5
35.7%
+ 5
35.7%
| 2
 
14.3%
~ 1
 
7.1%
1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 1931
99.1%
9
 
0.5%
8
 
0.4%
Final Punctuation
ValueCountFrequency (%)
82
83.7%
15
 
15.3%
» 1
 
1.0%
Initial Punctuation
ValueCountFrequency (%)
14
73.7%
4
 
21.1%
« 1
 
5.3%
Open Punctuation
ValueCountFrequency (%)
( 49
87.5%
[ 7
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 48
87.3%
] 7
 
12.7%
Other Number
ValueCountFrequency (%)
½ 2
66.7%
² 1
33.3%
Modifier Letter
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Space Separator
ValueCountFrequency (%)
153795
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 37
100.0%
Nonspacing Mark
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 756068
78.8%
Common 203319
 
21.2%
Han 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 94486
 
12.5%
t 57309
 
7.6%
o 56611
 
7.5%
a 51521
 
6.8%
n 47539
 
6.3%
i 46086
 
6.1%
r 45029
 
6.0%
s 42399
 
5.6%
h 37192
 
4.9%
l 30199
 
4.0%
Other values (73) 247697
32.8%
Common
ValueCountFrequency (%)
153795
75.6%
. 26655
 
13.1%
! 5785
 
2.8%
' 5676
 
2.8%
, 4231
 
2.1%
- 1931
 
0.9%
? 1161
 
0.6%
0 802
 
0.4%
" 582
 
0.3%
1 516
 
0.3%
Other values (42) 2185
 
1.1%
Han
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 958992
> 99.9%
Punctuation 280
 
< 0.1%
None 110
 
< 0.1%
CJK 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%
IPA Ext 2
 
< 0.1%
Modifier Letters 2
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
153795
16.0%
e 94486
 
9.9%
t 57309
 
6.0%
o 56611
 
5.9%
a 51521
 
5.4%
n 47539
 
5.0%
i 46086
 
4.8%
r 45029
 
4.7%
s 42399
 
4.4%
h 37192
 
3.9%
Other values (78) 327025
34.1%
Punctuation
ValueCountFrequency (%)
148
52.9%
82
29.3%
15
 
5.4%
14
 
5.0%
9
 
3.2%
8
 
2.9%
4
 
1.4%
None
ValueCountFrequency (%)
é 18
16.4%
ä 16
14.5%
ö 8
 
7.3%
á 6
 
5.5%
ó 6
 
5.5%
í 5
 
4.5%
ü 5
 
4.5%
ı 5
 
4.5%
· 4
 
3.6%
ñ 3
 
2.7%
Other values (26) 34
30.9%
IPA Ext
ValueCountFrequency (%)
ə 2
100.0%
CJK
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Modifier Letters
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

title
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 42277
Distinct (%): 93.0%
Missing: 3
Missing (%): < 0.1%
Memory size: 355.3 KiB

Cinderella                11
Hamlet                     9
Alice in Wonderland        9
Les Misérables             8
Beauty and the Beast       8
Other values (42272)   45415

Length

Max length: 105
Median length: 79
Mean length: 16.708535
Min length: 1

Characters and Unicode

Total characters: 759570
Distinct characters: 287
Distinct categories: 17
Distinct scripts: 7
Distinct blocks: 12
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39947
Unique (%): 87.9%

Sample

1st row: Toy Story
2nd row: Jumanji
3rd row: Grumpier Old Men
4th row: Waiting to Exhale
5th row: Father of the Bride Part II

Common Values

Value                  Count   Frequency (%)
Cinderella                11   < 0.1%
Hamlet                     9   < 0.1%
Alice in Wonderland        9   < 0.1%
Les Misérables             8   < 0.1%
Beauty and the Beast       8   < 0.1%
Treasure Island            7   < 0.1%
Blackout                   7   < 0.1%
A Christmas Carol          7   < 0.1%
The Three Musketeers       7   < 0.1%
The Hunters                6   < 0.1%
Other values (42267)   45381   99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 14571
 
10.7%
of 4938
 
3.6%
a 2244
 
1.6%
in 1697
 
1.2%
and 1634
 
1.2%
to 1055
 
0.8%
763
 
0.6%
man 665
 
0.5%
love 664
 
0.5%
for 602
 
0.4%
Other values (24431) 107634
78.9%

Most occurring characters

ValueCountFrequency (%)
91029
 
12.0%
e 76408
 
10.1%
a 49056
 
6.5%
o 45765
 
6.0%
n 40931
 
5.4%
r 40096
 
5.3%
i 39859
 
5.2%
t 36792
 
4.8%
s 29591
 
3.9%
h 28564
 
3.8%
Other values (277) 281479
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 535372
70.5%
Uppercase Letter 117493
 
15.5%
Space Separator 91029
 
12.0%
Other Punctuation 10513
 
1.4%
Decimal Number 3863
 
0.5%
Dash Punctuation 986
 
0.1%
Close Punctuation 87
 
< 0.1%
Open Punctuation 85
 
< 0.1%
Final Punctuation 38
 
< 0.1%
Other Letter 25
 
< 0.1%
Other values (7) 79
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 76408
14.3%
a 49056
9.2%
o 45765
 
8.5%
n 40931
 
7.6%
r 40096
 
7.5%
i 39859
 
7.4%
t 36792
 
6.9%
s 29591
 
5.5%
h 28564
 
5.3%
l 25992
 
4.9%
Other values (121) 122318
22.8%
Uppercase Letter
ValueCountFrequency (%)
T 16037
13.6%
S 10354
 
8.8%
M 8042
 
6.8%
B 7674
 
6.5%
C 7175
 
6.1%
A 6808
 
5.8%
D 6355
 
5.4%
L 5883
 
5.0%
H 5183
 
4.4%
W 5175
 
4.4%
Other values (65) 38807
33.0%
Other Letter
ValueCountFrequency (%)
ی 2
 
8.0%
ک 2
 
8.0%
چ 2
 
8.0%
ه 2
 
8.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
ª 1
 
4.0%
ا 1
 
4.0%
Other values (11) 11
44.0%
Other Punctuation
ValueCountFrequency (%)
: 3727
35.5%
' 2512
23.9%
. 1604
15.3%
, 1136
 
10.8%
! 648
 
6.2%
& 460
 
4.4%
? 269
 
2.6%
/ 80
 
0.8%
* 19
 
0.2%
# 13
 
0.1%
Other values (8) 45
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 864
22.4%
1 699
18.1%
0 619
16.0%
3 484
12.5%
9 230
 
6.0%
4 229
 
5.9%
5 225
 
5.8%
7 196
 
5.1%
8 161
 
4.2%
6 156
 
4.0%
Math Symbol
ValueCountFrequency (%)
+ 17
70.8%
× 3
 
12.5%
= 1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other Number
ValueCountFrequency (%)
½ 12
63.2%
² 3
 
15.8%
³ 2
 
10.5%
1
 
5.3%
1
 
5.3%
Other Symbol
ValueCountFrequency (%)
° 3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Currency Symbol
ValueCountFrequency (%)
$ 18
85.7%
¢ 2
 
9.5%
£ 1
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 971
98.5%
15
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 82
94.3%
] 5
 
5.7%
Open Punctuation
ValueCountFrequency (%)
( 80
94.1%
[ 5
 
5.9%
Final Punctuation
ValueCountFrequency (%)
37
97.4%
1
 
2.6%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
91029
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Format
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value       Count     Frequency (%)
Latin       652335    85.9%
Common      106680    14.0%
Cyrillic    361       < 0.1%
Greek       170       < 0.1%
Arabic      11        < 0.1%
Katakana    8         < 0.1%
Han         5         < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 76408
 
11.7%
a 49056
 
7.5%
o 45765
 
7.0%
n 40931
 
6.3%
r 40096
 
6.1%
i 39859
 
6.1%
t 36792
 
5.6%
s 29591
 
4.5%
h 28564
 
4.4%
l 25992
 
4.0%
Other values (107) 239281
36.7%
Common
ValueCountFrequency (%)
91029
85.3%
: 3727
 
3.5%
' 2512
 
2.4%
. 1604
 
1.5%
, 1136
 
1.1%
- 971
 
0.9%
2 864
 
0.8%
1 699
 
0.7%
! 648
 
0.6%
0 619
 
0.6%
Other values (50) 2871
 
2.7%
Cyrillic
ValueCountFrequency (%)
е 33
 
9.1%
о 32
 
8.9%
а 32
 
8.9%
н 26
 
7.2%
и 24
 
6.6%
р 23
 
6.4%
к 17
 
4.7%
в 16
 
4.4%
с 15
 
4.2%
л 14
 
3.9%
Other values (38) 129
35.7%
Greek
ValueCountFrequency (%)
α 20
 
11.8%
ι 14
 
8.2%
ο 14
 
8.2%
τ 9
 
5.3%
λ 8
 
4.7%
ρ 8
 
4.7%
ά 8
 
4.7%
ν 7
 
4.1%
ε 6
 
3.5%
π 6
 
3.5%
Other values (32) 70
41.2%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
ی 2
18.2%
ک 2
18.2%
چ 2
18.2%
ه 2
18.2%
ا 1
9.1%
س 1
9.1%
ج 1
9.1%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

Value                 Count     Frequency (%)
ASCII                 757982    99.8%
None                  1132      0.1%
Cyrillic              361       < 0.1%
Punctuation           62        < 0.1%
Arabic                11        < 0.1%
Katakana              8         < 0.1%
CJK                   5         < 0.1%
Misc Symbols          3         < 0.1%
Letterlike Symbols    2         < 0.1%
Math Operators        2         < 0.1%
Other values (2)      2         < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
91029
 
12.0%
e 76408
 
10.1%
a 49056
 
6.5%
o 45765
 
6.0%
n 40931
 
5.4%
r 40096
 
5.3%
i 39859
 
5.3%
t 36792
 
4.9%
s 29591
 
3.9%
h 28564
 
3.8%
Other values (76) 279891
36.9%
None
ValueCountFrequency (%)
é 218
19.3%
ä 128
 
11.3%
ö 56
 
4.9%
è 54
 
4.8%
ô 44
 
3.9%
ü 39
 
3.4%
ó 37
 
3.3%
á 35
 
3.1%
ı 35
 
3.1%
à 33
 
2.9%
Other values (108) 453
40.0%
Punctuation
ValueCountFrequency (%)
37
59.7%
15
24.2%
5
 
8.1%
2
 
3.2%
1
 
1.6%
1
 
1.6%
1
 
1.6%
Cyrillic
ValueCountFrequency (%)
е 33
 
9.1%
о 32
 
8.9%
а 32
 
8.9%
н 26
 
7.2%
и 24
 
6.6%
р 23
 
6.4%
к 17
 
4.7%
в 16
 
4.4%
с 15
 
4.2%
л 14
 
3.9%
Other values (38) 129
35.7%
Arabic
ValueCountFrequency (%)
ی 2
18.2%
ک 2
18.2%
چ 2
18.2%
ه 2
18.2%
ا 1
9.1%
س 1
9.1%
ج 1
9.1%
Misc Symbols
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%
Arrows
ValueCountFrequency (%)
1
100.0%

vote_average
Real number (ℝ)

Distinct: 92
Distinct (%): 0.2%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.6182072
Minimum: 0
Maximum: 10
Zeros: 2998
Zeros (%): 6.6%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 6
Q3: 6.8
95-th percentile: 7.8
Maximum: 10
Range: 10
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 1.924216
Coefficient of variation (CV): 0.34249644
Kurtosis: 2.5004022
Mean: 5.6182072
Median Absolute Deviation (MAD): 0.9
Skewness: -1.5189901
Sum: 255403.7
Variance: 3.7026072
Monotonicity: Not monotonic
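
Several of the vote_average statistics above follow from one another. The sketch below recomputes the derived ones from the reported values, as a sanity check rather than a description of how the profiler works internally. All inputs are copied from this report; the variable names are illustrative.

```python
import math

# Values copied from the vote_average summary in this report.
mean = 5.6182072
std = 1.924216
variance = 3.7026072
q1, q3 = 5.0, 6.8
total = 255403.7
n_observations = 45463  # from the dataset overview
n_missing = 3

cv = std / mean                      # coefficient of variation
iqr = q3 - q1                        # interquartile range
std_from_variance = math.sqrt(variance)
mean_from_sum = total / (n_observations - n_missing)

print(f"CV = {cv:.6f}, IQR = {iqr:.1f}")
print(f"sqrt(variance) = {std_from_variance:.6f}")
print(f"sum / non-missing count = {mean_from_sum:.6f}")
```

The recomputed mean matches the reported one only when the 3 missing values are left out of the denominator.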
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
0                    2998     6.6%
6                    2468     5.4%
5                    2001     4.4%
7                    1886     4.1%
6.5                  1722     3.8%
6.3                  1603     3.5%
5.5                  1381     3.0%
5.8                  1369     3.0%
6.4                  1350     3.0%
6.7                  1342     3.0%
Other values (82)    27340    60.1%

Value    Count    Frequency (%)
0        2998     6.6%
0.5      13       < 0.1%
0.7      1        < 0.1%
1        105      0.2%
1.1      1        < 0.1%
1.2      4        < 0.1%
1.3      13       < 0.1%
1.4      5        < 0.1%
1.5      30       0.1%
1.6      6        < 0.1%

Value    Count    Frequency (%)
10       190      0.4%
9.8      1        < 0.1%
9.6      1        < 0.1%
9.5      18       < 0.1%
9.4      3        < 0.1%
9.3      18       < 0.1%
9.2      4        < 0.1%
9.1      3        < 0.1%
9        159      0.3%
8.9      7        < 0.1%

return
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 5232
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 658.7797
Minimum: 0
Maximum: 12396383
Zeros: 40082
Zeros (%): 88.2%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2.5325821
Maximum: 12396383
Range: 12396383
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 74621.796
Coefficient of variation (CV): 113.27276
Kurtosis: 20712.598
Mean: 658.7797
Median Absolute Deviation (MAD): 0
Skewness: 138.46208
Sum: 29950101
Variance: 5.5684124 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
0                      40082    88.2%
1                      20       < 0.1%
2                      12       < 0.1%
4                      11       < 0.1%
5                      8        < 0.1%
3                      7        < 0.1%
2.5                    7        < 0.1%
1.333333333            7        < 0.1%
1.5                    6        < 0.1%
7                      4        < 0.1%
Other values (5222)    5299     11.7%

Value                   Count    Frequency (%)
0                       40082    88.2%
5.217391304 × 10^-7     1        < 0.1%
7.5 × 10^-7             1        < 0.1%
9.375 × 10^-7           1        < 0.1%
1.499133126 × 10^-6     1        < 0.1%
1.8 × 10^-6             1        < 0.1%
1.916666667 × 10^-6     1        < 0.1%
3.5 × 10^-6             1        < 0.1%
4 × 10^-6               1        < 0.1%
5.111111111 × 10^-6     1        < 0.1%

Value          Count    Frequency (%)
12396383       1        < 0.1%
8500000        1        < 0.1%
4197476.625    1        < 0.1%
2755584        1        < 0.1%
1018619.283    1        < 0.1%
1000000        1        < 0.1%
26881.72043    1        < 0.1%
12890.38667    1        < 0.1%
5330.33945     1        < 0.1%
4133.333333    1        < 0.1%
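
The rows in the Sample section are consistent with `return` being the ratio revenue / budget, set to 0 when the budget is 0 (for example, Toy Story: 373554033 / 30000000 ≈ 12.4518). The report does not state this definition, so it is an inference. The `compute_return` helper below is a hypothetical sketch under that assumption, checked against four sample rows.

```python
def compute_return(revenue, budget):
    """Hypothetical helper: revenue-to-budget ratio, 0.0 when budget is 0.

    Assumption inferred from the sample rows; not confirmed by the report.
    """
    if budget == 0:
        return 0.0
    return revenue / budget


# (title, budget, revenue, reported return) copied from the Sample section.
sample_rows = [
    ("Toy Story", 30000000.0, 373554033.0, 12.451801),
    ("Jumanji", 65000000.0, 262797249.0, 4.043035),
    ("Father of the Bride Part II", 0.0, 76578911.0, 0.0),
    ("Sabrina", 58000000.0, 0.0, 0.0),
]

for title, budget, revenue, reported in sample_rows:
    ratio = compute_return(revenue, budget)
    print(f"{title}: computed {ratio:.6f}, reported {reported:.6f}")
```

Under this reading, a very small recorded budget paired with a normal revenue figure would produce ratios in the millions, which may explain the extreme maxima and the skewness of 138.46.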

release_year
Real number (ℝ)

Distinct: 136
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1988.0694
Minimum: 0
Maximum: 2020
Zeros: 87
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 355.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1940
Q1: 1978
Median: 2001
Q3: 2010
95-th percentile: 2015
Maximum: 2020
Range: 2020
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 90.309172
Coefficient of variation (CV): 0.045425562
Kurtosis: 446.51295
Mean: 1988.0694
Median Absolute Deviation (MAD): 12
Skewness: -20.430919
Sum: 90383601
Variance: 8155.7466
Monotonicity: Not monotonic
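
The 87 zeros in release_year are probably placeholders for a missing release date rather than real years. This is an assumption, supported by the tail of the Sample section, where one row pairs a NaN release_date with release_year 0. Placeholder zeros would account for the minimum of 0 and for much of the strong negative skew. The sketch below uses only totals reported in this document to show how far the mean shifts once the zeros are excluded; the variable names are illustrative.

```python
# Totals copied from this report.
n_total = 45463    # number of observations (dataset overview)
n_zero = 87        # release_year values equal to 0
total = 90383601   # reported Sum of release_year

# The zeros add nothing to the sum, so excluding them only shrinks the count.
mean_all = total / n_total
mean_nonzero = total / (n_total - n_zero)

print(f"mean including zeros: {mean_all:.4f}")
print(f"mean excluding zeros: {mean_nonzero:.4f}")
```

Excluding 0.2% of the rows moves the mean by almost four years, so the zeros are worth masking before any modelling.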
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
2014                  1974     4.3%
2015                  1905     4.2%
2013                  1889     4.2%
2012                  1722     3.8%
2011                  1667     3.7%
2016                  1604     3.5%
2009                  1586     3.5%
2010                  1501     3.3%
2008                  1473     3.2%
2007                  1320     2.9%
Other values (126)    28822    63.4%

Value    Count    Frequency (%)
0        87       0.2%
1874     1        < 0.1%
1878     1        < 0.1%
1883     1        < 0.1%
1887     1        < 0.1%
1888     2        < 0.1%
1890     5        < 0.1%
1891     6        < 0.1%
1892     3        < 0.1%
1893     1        < 0.1%

Value    Count    Frequency (%)
2020     1        < 0.1%
2018     5        < 0.1%
2017     532      1.2%
2016     1604     3.5%
2015     1905     4.2%
2014     1974     4.3%
2013     1889     4.2%
2012     1722     3.8%
2011     1667     3.7%
2010     1501     3.3%

Interactions

[Interaction plots: 64 panels; images not preserved in this text version.]

Correlations

                   budget      id  popularity  revenue  runtime  vote_average  return  release_year  original_language  status
budget              1.000  -0.256       0.463    0.644    0.227         0.072   0.775         0.142              0.000   0.000
id                 -0.256   1.000      -0.412   -0.278   -0.207        -0.151  -0.262         0.386              0.071   0.056
popularity          0.463  -0.412       1.000    0.491    0.308         0.243   0.447         0.190              0.000   0.000
revenue             0.644  -0.278       0.491    1.000    0.254         0.127   0.853         0.105              0.000   0.000
runtime             0.227  -0.207       0.308    0.254    1.000         0.194   0.234         0.036              0.111   0.000
vote_average        0.072  -0.151       0.243    0.127    0.194         1.000   0.120        -0.006              0.070   0.019
return              0.775  -0.262       0.447    0.853    0.234         0.120   1.000         0.088              0.000   0.000
release_year        0.142   0.386       0.190    0.105    0.036        -0.006   0.088         1.000              0.000   0.098
original_language   0.000   0.071       0.000    0.000    0.111         0.070   0.000         0.000              1.000   0.000
status              0.000   0.056       0.000    0.000    0.000         0.019   0.000         0.098              0.000   1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
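
Nullity correlation can be read as the ordinary Pearson correlation between the "is missing" indicators of two columns. The sketch below computes it with the standard library only. The `nullity_correlation` helper and the tiny example columns are hypothetical, with `None` standing in for a missing value; this is not the code that produced the heatmap.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical four-row columns (assumption), None marks a missing value.
tagline = ["Roll the dice", None, None, "A deadly game of wits."]
collection = ["Toy Story Collection", None, None, "James Bond Collection"]
overview = [None, "some text", "some text", None]

same = nullity_correlation(tagline, collection)   # missing in the same rows
opposite = nullity_correlation(tagline, overview)  # missing in opposite rows
print(same, opposite)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is present where the other is missing, and values near 0 indicate no relationship.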

Sample

index | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | return | release_year
0 | Toy Story Collection | 30000000.0 | Animation, Comedy, Family | 862 | en | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. | 21.946943 | Pixar Animation Studios | United States of America | 1995-10-30 | 373554033.0 | 81.0 | English | Released | NaN | Toy Story | 7.7 | 12.451801 | 1995
1 | NaN | 65000000.0 | Adventure, Fantasy, Family | 8844 | en | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. | 17.015539 | TriStar Pictures, Teitler Film, Interscope Communications | United States of America | 1995-12-15 | 262797249.0 | 104.0 | English, Français | Released | Roll the dice and unleash the excitement! | Jumanji | 6.9 | 4.043035 | 1995
2 | Grumpy Old Men Collection | 0.0 | Romance, Comedy | 15602 | en | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. | 11.712900 | Warner Bros., Lancaster Gate | United States of America | 1995-12-22 | 0.0 | 101.0 | English | Released | Still Yelling. Still Fighting. Still Ready for Love. | Grumpier Old Men | 6.5 | 0.000000 | 1995
3 | NaN | 16000000.0 | Comedy, Drama, Romance | 31357 | en | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. | 3.859495 | Twentieth Century Fox Film Corporation | United States of America | 1995-12-22 | 81452156.0 | 127.0 | English | Released | Friends are the people who let you be yourself... and never let you forget it. | Waiting to Exhale | 6.1 | 5.090760 | 1995
4 | Father of the Bride Collection | 0.0 | Comedy | 11862 | en | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. | 8.387519 | Sandollar Productions, Touchstone Pictures | United States of America | 1995-02-10 | 76578911.0 | 106.0 | English | Released | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! | Father of the Bride Part II | 5.7 | 0.000000 | 1995
5 | NaN | 60000000.0 | Action, Crime, Drama, Thriller | 949 | en | Obsessive master thief, Neil McCauley leads a top-notch crew on various insane heists throughout Los Angeles while a mentally unstable detective, Vincent Hanna pursues him without rest. Each man recognizes and respects the ability and the dedication of the other even though they are aware their cat-and-mouse game may end in violence. | 17.924927 | Regency Enterprises, Forward Pass, Warner Bros. | United States of America | 1995-12-15 | 187436818.0 | 170.0 | English, Español | Released | A Los Angeles Crime Saga | Heat | 7.7 | 3.123947 | 1995
6 | NaN | 58000000.0 | Comedy, Romance | 11860 | en | An ugly duckling having undergone a remarkable change, still harbors feelings for her crush: a carefree playboy, but not before his business-focused brother has something to say about it. | 6.677277 | Paramount Pictures, Scott Rudin Productions, Mirage Enterprises, Sandollar Productions, Constellation Entertainment, Worldwide, Mont Blanc Entertainment GmbH | Germany, United States of America | 1995-12-15 | 0.0 | 127.0 | Français, English | Released | You are cordially invited to the most surprising merger of the year. | Sabrina | 6.2 | 0.000000 | 1995
7 | NaN | 0.0 | Action, Adventure, Drama, Family | 45325 | en | A mischievous young boy, Tom Sawyer, witnesses a murder by the deadly Injun Joe. Tom becomes friends with Huckleberry Finn, a boy with no future and no family. Tom has to choose between honoring a friendship or honoring an oath because the town alcoholic is accused of the murder. Tom and Huck go through several adventures trying to retrieve evidence. | 2.561161 | Walt Disney Pictures | United States of America | 1995-12-22 | 0.0 | 97.0 | English, Deutsch | Released | The Original Bad Boys. | Tom and Huck | 5.4 | 0.000000 | 1995
8 | NaN | 35000000.0 | Action, Adventure, Thriller | 9091 | en | International action superstar Jean Claude Van Damme teams with Powers Boothe in a Tension-packed, suspense thriller, set against the back-drop of a Stanley Cup game.Van Damme portrays a father whose daughter is suddenly taken during a championship hockey game. With the captors demanding a billion dollars by game's end, Van Damme frantically sets a plan in motion to rescue his daughter and abort an impending explosion before the final buzzer... | 5.231580 | Universal Pictures, Imperial Entertainment, Signature Entertainment | United States of America | 1995-12-22 | 64350171.0 | 106.0 | English | Released | Terror goes into overtime. | Sudden Death | 5.5 | 1.838576 | 1995
9 | James Bond Collection | 58000000.0 | Adventure, Action, Thriller | 710 | en | James Bond must unmask the mysterious head of the Janus Syndicate and prevent the leader from utilizing the GoldenEye weapons system to inflict devastating revenge on Britain. | 14.686036 | United Artists, Eon Productions | United Kingdom, United States of America | 1995-11-16 | 352194034.0 | 130.0 | English, Pусский, Español | Released | No limits. No fears. No substitutes. | GoldenEye | 6.6 | 6.072311 | 1995

index | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | return | release_year
45453 | NaN | 0.0 | Horror, Mystery, Thriller | 84419 | en | An unsuccessful sculptor saves a madman named "The Creeper" from drowning. Seeing an opportunity for revenge, he tricks the psycho into murdering his critics. | 0.222814 | Universal Pictures | United States of America | 1946-03-29 | 0.0 | 65.0 | English | Released | Meet...The CREEPER! | House of Horrors | 6.3 | 0.0 | 1946
45454 | NaN | 0.0 | Mystery, Horror | 390959 | en | In this true-crime documentary, we delve into the murder spree that was the inspiration for Joe Berlinger's "Book of Shadows: Blair Witch 2". | 0.076061 | unknown | unknown | 2000-10-22 | 0.0 | 45.0 | English | Released | NaN | Shadow of the Blair Witch | 7.0 | 0.0 | 2000
45455 | NaN | 0.0 | Horror | 289923 | en | A film archivist revisits the story of Rustin Parr, a hermit thought to have murdered seven children while under the possession of the Blair Witch. | 0.386450 | Neptune Salad Entertainment, Pirie Productions | United States of America | 2000-10-03 | 0.0 | 30.0 | English | Released | Do you know what happened 50 years before "The Blair Witch Project"? | The Burkittsville 7 | 7.0 | 0.0 | 2000
45456 | NaN | 0.0 | Science Fiction | 222848 | en | It's the year 3000 AD. The world's most dangerous women are banished to a remote asteroid 45 million light years from earth. Kira Murphy doesn't belong; wrongfully accused of a crime she did not commit, she's thrown in this interplanetary prison and left to her own defenses. But Kira's a fighter, and soon she finds herself in the middle of a female gang war; where everyone wants a piece of the action... and a piece of her! "Caged Heat 3000" takes the Women-in-Prison genre to a whole new level... and a whole new galaxy! | 0.661558 | Concorde-New Horizons | United States of America | 1995-01-01 | 0.0 | 85.0 | English | Released | NaN | Caged Heat 3000 | 3.5 | 0.0 | 1995
45457 | NaN | 0.0 | Drama, Action, Romance | 30840 | en | Yet another version of the classic epic, with enough variation to make it interesting. The story is the same, but some of the characters are quite different from the usual, in particular Uma Thurman's very special maid Marian. The photography is also great, giving the story a somewhat darker tone. | 5.683753 | Westdeutscher Rundfunk (WDR), Working Title Films, 20th Century Fox Television, CanWest Global Communications | Canada, Germany, United Kingdom, United States of America | 1991-05-13 | 0.0 | 104.0 | English | Released | NaN | Robin Hood | 5.7 | 0.0 | 1991
45458 | NaN | 0.0 | Drama, Family | 439050 | fa | Rising and falling between a man and woman. | 0.072051 | unknown | Iran | NaN | 0.0 | 90.0 | فارسی | Released | Rising and falling between a man and woman | Subdue | 4.0 | 0.0 | 0
45459 | NaN | 0.0 | Drama | 111109 | tl | An artist struggles to finish his work while a storyline about a cult plays in his head. | 0.178241 | Sine Olivia | Philippines | 2011-11-17 | 0.0 | 360.0 | NaN | Released | NaN | Century of Birthing | 9.0 | 0.0 | 2011
45460 | NaN | 0.0 | Action, Drama, Thriller | 67758 | en | When one of her hits goes wrong, a professional assassin ends up with a suitcase full of a million dollars belonging to a mob boss ... | 0.903007 | American World Pictures | United States of America | 2003-08-01 | 0.0 | 90.0 | English | Released | A deadly game of wits. | Betrayal | 3.8 | 0.0 | 2003
45461 | NaN | 0.0 | NaN | 227506 | en | In a small town live two brothers, one a minister and the other one a hunchback painter of the chapel who lives with his wife. One dreadful and stormy night, a stranger knocks at the door asking for shelter. The stranger talks about all the good things of the earthly life the minister is missing because of his puritanical faith. The minister comes to accept the stranger's viewpoint but it is others who will pay the consequences because the minister will discover the human pleasures thanks to, ehem, his sister- in -law… The tormented minister and his cuckolded brother will die in a strange accident in the chapel and later an infant will be born from the minister's adulterous relationship. | 0.003503 | Yermoliev | Russia | 1917-10-21 | 0.0 | 87.0 | NaN | Released | NaN | Satan Triumphant | 0.0 | 0.0 | 1917
45462 | NaN | 0.0 | NaN | 461257 | en | 50 years after decriminalisation of homosexuality in the UK, director Daisy Asquith mines the jewels of the BFI archive to take us into the relationships, desires, fears and expressions of gay men and women in the 20th century. | 0.163015 | unknown | United Kingdom | 2017-06-09 | 0.0 | 75.0 | English | Released | NaN | Queerama | 0.0 | 0.0 | 2017

Duplicate rows

Most frequently occurring

index | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | return | release_year | # duplicates
14 | NaN | 0.0 | Thriller, Mystery | 141971 | fi | Recovering from a nail gun shot to the head and 13 months of coma, doctor Pekka Valinta starts to unravel the mystery of his past, still suffering from total amnesia. | 0.411949 | Filmiteollisuus Fine | Finland | 2008-12-26 | 0.0 | 108.0 | suomi | Released | Which one is the first to return - memory or the murderer? | Blackout | 6.7 | 0.0 | 2008 | 3
0 | Why We Fight | 0.0 | Documentary | 159849 | en | The third film of Frank Capra's 'Why We Fight" propaganda film series, dealing with the Nazi conquest of Western Europe in 1940. | 0.473322 | unknown | United States of America | 1943-01-01 | 0.0 | 57.0 | English | Released | NaN | Why We Fight: Divide and Conquer | 5.0 | 0.0 | 1943 | 2
1 | NaN | 0.0 | Action, Drama, Romance, Adventure | 99080 | en | Originally called White Thunder, American producer Varick Frissell's 1931 film was inspired by his love for the Canadian Arctic Circle. Set in a beautifully black-and-white filmed Newfoundland, it is the story of a rivalry between two seal hunters that plays out on the ice floes during a hunt. Unsatisfied with the first cut, Frissell arranged for the crew to accompany an actual Newfoundland seal hunt on The SS Viking, on which an explosion of dynamite (carried regularly at the time on Arctic ships to combat ice jams) killed many members of the crew, including Frissell. The film was renamed in honor of the dead. | 0.002362 | unknown | unknown | 1931-06-21 | 0.0 | 70.0 | English | Released | Actually produced during the Great Newfoundland Seal Hunt and You see the REAL thing | The Viking | 0.0 | 0.0 | 1931 | 2
2 | NaN | 0.0 | Action, Horror, Science Fiction | 18440 | en | When a comet strikes Earth and kicks up a cloud of toxic dust, hundreds of humans join the ranks of the living dead. But there's bad news for the survivors: The newly minted zombies are hell-bent on eradicating every last person from the planet. For the few human beings who remain, going head to head with the flesh-eating fiends is their only chance for long-term survival. Yet their battle will be dark and cold, with overwhelming odds. | 1.436085 | unknown | United States of America | 2007-01-01 | 0.0 | 89.0 | English | Released | NaN | Days of Darkness | 5.0 | 0.0 | 2007 | 2
3 | NaN | 0.0 | Adventure, Animation, Drama, Action, Foreign | 23305 | en | In feudal India, a warrior (Khan) who renounces his role as the longtime enforcer to a local lord becomes the prey in a murderous hunt through the Himalayan mountains. | 1.967992 | Filmfour | France, Germany, India, United Kingdom | 2001-09-23 | 0.0 | 86.0 | हिन्दी | Released | NaN | The Warrior | 6.3 | 0.0 | 2001 | 2
4 | NaN | 0.0 | Comedy | 97995 | en | After breaking a mirror in his home, superstitious Max tries to avoid situations which could bring bad luck but in doing so, causes himself the worst luck imaginable. | 0.141558 | Max Linder Productions | United States of America | 1921-02-06 | 0.0 | 62.0 | English | Released | NaN | Seven Years Bad Luck | 5.6 | 0.0 | 1921 | 2
5 | NaN | 0.0 | Comedy, Drama | 11115 | en | As an ex-gambler teaches a hot-shot college kid some things about playing cards, he finds himself pulled into the world series of poker, where his protégé is his toughest competition. | 6.880365 | Andertainment Group, Crescent City Pictures, Tag Entertainment | United States of America | 2008-01-29 | 0.0 | 85.0 | English | Released | NaN | Deal | 5.2 | 0.0 | 2008 | 2
6 | NaN | 0.0 | Comedy, Drama | 265189 | sv | While holidaying in the French Alps, a Swedish family deals with acts of cowardliness as an avalanche breaks out. | 12.165685 | Motlys, Coproduction Office, Film i Väst | Norway, Sweden, France | 2014-08-15 | 1359497.0 | 118.0 | Français, Norsk, svenska, English | Released | NaN | Force Majeure | 6.8 | 0.0 | 2014 | 2
7 | NaN | 0.0 | Crime, Drama, Thriller | 5511 | fr | Hitman Jef Costello is a perfectionist who always carefully plans his murders and who never gets caught. | 9.091288 | Fida cinematografica, Compagnie Industrielle et Commerciale Cinématographique (CICC), TC Productions, Filmel | France, Italy | 1967-10-25 | 39481.0 | 105.0 | Français | Released | There is no solitude greater than that of the Samurai | Le Samouraï | 7.9 | 0.0 | 1967 | 2
8 | NaN | 0.0 | Drama | 25541 | da | Former Danish servicemen Lars and Jimmy are thrown together while training in a neo-Nazi group. Moving from hostility through grudging admiration to friendship and finally passion, events take a darker turn when their illicit relationship is uncovered. | 2.587911 | unknown | Sweden, Denmark | 2009-10-21 | 0.0 | 90.0 | Dansk | Released | NaN | Brotherhood | 7.1 | 0.0 | 2009 | 2
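
A "# duplicates" column like the one above can be obtained by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below does this with the standard library. The shortened (title, release_year) rows are hypothetical stand-ins for full records, and the approach is an illustration rather than the profiler's implementation.

```python
from collections import Counter

# Hypothetical shortened rows (assumption): each tuple stands for a full record.
rows = [
    ("Blackout", 2008),
    ("Deal", 2008),
    ("Blackout", 2008),
    ("Heat", 1995),
    ("Blackout", 2008),
    ("Deal", 2008),
]

# Count identical rows, then keep only those that occur more than once.
row_counts = Counter(rows)
duplicates = sorted(
    ((row, count) for row, count in row_counts.items() if count > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, count in duplicates:
    print(row, count)
```

Rows are compared as whole tuples, so two records count as duplicates only when every field matches.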